Discussion:
[PATCH] drivers/char/mem.c: Add /dev/ioports, supporting 16-bit and 32-bit ports
Josh Triplett
2014-05-09 19:19:31 UTC
/dev/port only supports reading and writing 8-bit ports; multi-byte
operations on /dev/port will just operate on multiple successive 8-bit
ports.

Add a new device, /dev/ioports, which supports reading and writing
16-bit and 32-bit ports. This makes it possible to perform arbitrary
I/O port operations cleanly from privileged userspace processes, without
using iopl or ioperm.

Signed-off-by: Josh Triplett <***@joshtriplett.org>
---

Written after encountering yet another out-of-tree omnibus "do arbitrary
I/O for test purposes" driver; this one's main reason for existence was
the inability to operate on 16-bit and 32-bit I/O ports. Let's get a
proper interface into the kernel, to make such drivers obsolete.

I've also written a corresponding manpage patch, which I'll submit as a
reply to this one.
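
For illustration, a minimal user-space sketch of the intended usage; the
device node semantics are as described above, but the port number and the
lack of error reporting here are only examples:

        #include <fcntl.h>
        #include <stdint.h>
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
                uint32_t val;
                int fd = open("/dev/ioports", O_RDWR);

                if (fd < 0)
                        return 1;
                /* The file offset selects the port; a 4-byte read performs a
                 * single 32-bit inl() rather than four successive byte ports. */
                if (pread(fd, &val, sizeof(val), 0xcf8) != sizeof(val))
                        return 1;
                printf("port 0xcf8 = 0x%08x\n", (unsigned int)val);
                close(fd);
                return 0;
        }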

drivers/char/mem.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 77 insertions(+), 2 deletions(-)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 917403f..84e0526 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -35,6 +35,7 @@
 #endif
 
 #define DEVPORT_MINOR	4
+#define DEVIOPORTS_MINOR	12
 
 static inline unsigned long size_inside_page(unsigned long start,
 					     unsigned long size)
@@ -584,6 +585,69 @@ static ssize_t write_port(struct file *file, const char __user *buf,
 	*ppos = i;
 	return tmp-buf;
 }
+
+static ssize_t read_ioports(struct file *file, char __user *buf,
+			    size_t count, loff_t *ppos)
+{
+	unsigned long port = *ppos;
+
+	if (!access_ok(VERIFY_WRITE, buf, count))
+		return -EFAULT;
+	if (port > 65535)
+		return 0;
+	switch (count) {
+	case 1:
+		if (__put_user(inb(port), buf) < 0)
+			return -EFAULT;
+		return 1;
+	case 2:
+		if (__put_user(inw(port), buf) < 0)
+			return -EFAULT;
+		return 2;
+	case 4:
+		if (__put_user(inl(port), buf) < 0)
+			return -EFAULT;
+		return 4;
+	default:
+		return -EINVAL;
+	}
+}
+
+static ssize_t write_ioports(struct file *file, const char __user *buf,
+			     size_t count, loff_t *ppos)
+{
+	unsigned long port = *ppos;
+
+	if (!access_ok(VERIFY_READ, buf, count))
+		return -EFAULT;
+	if (port > 65535)
+		return 0;
+	switch (count) {
+	case 1: {
+		u8 b;
+		if (__get_user(b, buf))
+			return -EFAULT;
+		outb(b, port);
+		return 1;
+	}
+	case 2: {
+		u16 w;
+		if (__get_user(w, buf))
+			return -EFAULT;
+		outw(w, port);
+		return 2;
+	}
+	case 4: {
+		u32 l;
+		if (__get_user(l, buf))
+			return -EFAULT;
+		outl(l, port);
+		return 4;
+	}
+	default:
+		return -EINVAL;
+	}
+}
 #endif
 
 static ssize_t read_null(struct file *file, char __user *buf,
@@ -779,6 +843,13 @@ static const struct file_operations port_fops = {
 	.write = write_port,
 	.open = open_port,
 };
+
+static const struct file_operations ioports_fops = {
+	.llseek = memory_lseek,
+	.read = read_ioports,
+	.write = write_ioports,
+	.open = open_port,
+};
 #endif
 
 static const struct file_operations zero_fops = {
@@ -827,6 +898,9 @@ static const struct memdev {
 #ifdef CONFIG_PRINTK
 	[11] = { "kmsg", 0644, &kmsg_fops, NULL },
 #endif
+#ifdef CONFIG_DEVPORT
+	[12] = { "ioports", 0, &ioports_fops, NULL },
+#endif
 };
 
 static int memory_open(struct inode *inode, struct file *filp)
@@ -892,9 +966,10 @@ static int __init chr_dev_init(void)
 			continue;
 
 		/*
-		 * Create /dev/port?
+		 * Create /dev/port and /dev/ioports?
 		 */
-		if ((minor == DEVPORT_MINOR) && !arch_has_dev_port())
+		if ((minor == DEVPORT_MINOR || minor == DEVIOPORTS_MINOR)
+		    && !arch_has_dev_port())
 			continue;
 
 		device_create(mem_class, NULL, MKDEV(MEM_MAJOR, minor),
--
2.0.0.rc2

Josh Triplett
2014-05-09 19:21:40 UTC
/dev/ioports works like /dev/port, but supports 16-bit and 32-bit ports.

Signed-off-by: Josh Triplett <***@joshtriplett.org>
---

Wasn't sure whether I should update the "Modified" notice at the top;
ideally such things should go away in favor of git.

man4/ioports.4 | 1 +
man4/mem.4 | 21 +++++++++++++++++++--
2 files changed, 20 insertions(+), 2 deletions(-)
create mode 100644 man4/ioports.4

diff --git a/man4/ioports.4 b/man4/ioports.4
new file mode 100644
index 0000000..d4c1762
--- /dev/null
+++ b/man4/ioports.4
@@ -0,0 +1 @@
+.so man4/mem.4
diff --git a/man4/mem.4 b/man4/mem.4
index 74b7b89..adfd810 100644
--- a/man4/mem.4
+++ b/man4/mem.4
@@ -25,7 +25,7 @@
.\" Modified Sat Jul 24 16:59:10 1993 by Rik Faith (***@cs.unc.edu)
.TH MEM 4 1992-11-21 "Linux" "Linux Programmer's Manual"
.SH NAME
-mem, kmem, port \- system memory, kernel memory and system ports
+mem, kmem, port, ioports \- system memory, kernel memory and system ports
.SH DESCRIPTION
.B mem
is a character device file
@@ -66,7 +66,8 @@ chown root:kmem /dev/kmem
.B port
is similar to
.BR mem ,
-but the I/O ports are accessed.
+but the I/O ports are accessed. Multi-byte reads or writes to port will
+operate on successive byte-sized ports.
.LP
It is typically created by:
.RS
@@ -75,12 +76,28 @@ mknod \-m 660 /dev/port c 1 4
.br
chown root:mem /dev/port
.RE
+.LP
+.B ioports
+is similar to
+.BR port ,
+but supports reads and writes of 16-bit or 32-bit ports. Reads or writes to
+ioport with sizes other than 1, 2, or 4 will return EINVAL.
+.LP
+It is typically created by:
+.RS
+.sp
+mknod \-m 660 /dev/ioports c 1 12
+.br
+chown root:mem /dev/ioports
+.RE
.SH FILES
.I /dev/mem
.br
.I /dev/kmem
.br
.I /dev/port
+.br
+.I /dev/ioports
.SH SEE ALSO
.BR chown (1),
.BR mknod (1),
--
2.0.0.rc2

Michael Kerrisk (man-pages)
2014-05-13 08:27:43 UTC
Hi Josh,
Post by Josh Triplett
/dev/ioports works like /dev/port, but supports 16-bit and 32-bit ports.
---
Wasn't sure whether I should update the "Modified" notice at the top;
ideally such things should go away in favor of git.
Mostly they do go away with Git, but I do add Copyright notices when
the changes to the page are substantial (as here).

Could you ping me again if/when this looks like heading to mainline?

Thanks,

Michael
Post by Josh Triplett
man4/ioports.4 | 1 +
man4/mem.4 | 21 +++++++++++++++++++--
2 files changed, 20 insertions(+), 2 deletions(-)
create mode 100644 man4/ioports.4
diff --git a/man4/ioports.4 b/man4/ioports.4
new file mode 100644
index 0000000..d4c1762
--- /dev/null
+++ b/man4/ioports.4
@@ -0,0 +1 @@
+.so man4/mem.4
diff --git a/man4/mem.4 b/man4/mem.4
index 74b7b89..adfd810 100644
--- a/man4/mem.4
+++ b/man4/mem.4
@@ -25,7 +25,7 @@
.TH MEM 4 1992-11-21 "Linux" "Linux Programmer's Manual"
.SH NAME
-mem, kmem, port \- system memory, kernel memory and system ports
+mem, kmem, port, ioports \- system memory, kernel memory and system ports
.SH DESCRIPTION
.B mem
is a character device file
@@ -66,7 +66,8 @@ chown root:kmem /dev/kmem
.B port
is similar to
.BR mem ,
-but the I/O ports are accessed.
+but the I/O ports are accessed. Multi-byte reads or writes to port will
+operate on successive byte-sized ports.
.LP
.RS
@@ -75,12 +76,28 @@ mknod \-m 660 /dev/port c 1 4
.br
chown root:mem /dev/port
.RE
+.LP
+.B ioports
+is similar to
+.BR port ,
+but supports reads and writes of 16-bit or 32-bit ports. Reads or writes to
+ioport with sizes other than 1, 2, or 4 will return EINVAL.
+.LP
+.RS
+.sp
+mknod \-m 660 /dev/ioports c 1 12
+.br
+chown root:mem /dev/ioports
+.RE
.SH FILES
.I /dev/mem
.br
.I /dev/kmem
.br
.I /dev/port
+.br
+.I /dev/ioports
.SH SEE ALSO
.BR chown (1),
.BR mknod (1),
--
2.0.0.rc2
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
Arnd Bergmann
2014-05-09 19:58:53 UTC
Post by Josh Triplett
+ if (!access_ok(VERIFY_WRITE, buf, count))
+ return -EFAULT;
+ if (port > 65535)
+ return 0;
This should probably test against IO_SPACE_LIMIT, which may be zero,
something larger than 65536 or even ULONG_MAX, depending on the
architecture.

In cases where this IO_SPACE_LIMIT is zero or ULONG_MAX, we should
probably disallow access completely. The former case is for architectures
that don't have any I/O ports, the other is either a mistake, or is
used when inb is defined as readb, and the port numbers are just virtual
addresses.
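
Something along these lines, as a rough sketch of the suggested check, not
actual code from the patch (IO_SPACE_LIMIT is per-architecture, and the
exact policy is up for discussion):

        if (IO_SPACE_LIMIT == 0 || IO_SPACE_LIMIT == ULONG_MAX)
                return -ENODEV;         /* no usable port I/O space here */
        if (port > IO_SPACE_LIMIT)
                return 0;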

Arnd
H. Peter Anvin
2014-05-09 20:54:40 UTC
Post by Arnd Bergmann
Post by Josh Triplett
+ if (!access_ok(VERIFY_WRITE, buf, count))
+ return -EFAULT;
+ if (port > 65535)
+ return 0;
This should probably test against IO_SPACE_LIMIT, which may be zero,
something larger than 65536 or even ULONG_MAX, depending on the
architecture.
In cases where this IO_SPACE_LIMIT is zero or ULONG_MAX, we should
probably disallow access completely. The former case is for architectures
that don't have any I/O ports, the other is either a mistake, or is
used when inb is defined as readb, and the port numbers are just virtual
addresses.
PCI supports a 32-bit I/O address space, so if the architecture permits
it, having a 32-bit I/O space is perfectly legitimate.

It is worth noting that /dev/port has the same problem.

However, if we're going to have these devices I'm wondering if having
/dev/portw and /dev/portl (or something like that) might not make sense,
rather than requiring a system call per transaction.

Also, x86 supports unaligned I/O port references, but not all
architectures do. On the other hand, x86 also supports ioperm().

-hpa



Arnd Bergmann
2014-05-09 21:14:52 UTC
Post by H. Peter Anvin
Post by Arnd Bergmann
+ if (!access_ok(VERIFY_WRITE, buf, count))
+ return -EFAULT;
+ if (port > 65535)
+ return 0;
This should probably test against IO_SPACE_LIMIT, which may be zero,
something larger than 65536 or even ULONG_MAX, depending on the
architecture.
In cases where this IO_SPACE_LIMIT is zero or ULONG_MAX, we should
probably disallow access completely. The former case is for architectures
that don't have any I/O ports, the other is either a mistake, or is
used when inb is defined as readb, and the port numbers are just virtual
addresses.
PCI supports a 32-bit I/O address space, so if the architecture permits
it, having a 32-bit I/O space is perfectly legitimate.
Right, but on all 32-bit architectures other than x86, the I/O ports
are mapped into physical memory addresses, which means you can't
map all of the I/O space into the CPU address space. On 64-bit
architectures you can, but then it's UINT_MAX, not ULONG_MAX.

There is also the theoretical case of machines mapping a window
of I/O addresses with portnumber==phys_addr as we normally do
for PCI memory space, but I haven't seen anyone actually do that.
Practically every PCI implementation (if they have I/O space at all)
maps a small number of ports (65536 or 1048576 mostly) starting at
port number zero to a fixed physical CPU address.
Post by H. Peter Anvin
It is worth noting that /dev/port has the same problem.
Right. We should fix that, too.
Post by H. Peter Anvin
However, if we're going to have these devices I'm wondering if having
/dev/portw and /dev/portl (or something like that) might not make sense,
rather than requiring a system call per transaction.
Actually the behavior of /dev/port for >1 byte writes seems questionable
already: There are very few devices on which writing to consecutive
port numbers makes sense. Normally you just want to write a series
of bytes (or 16/32 bit words) into the same port number instead,
as the outsb()/outsw()/outsl() functions do.
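
Roughly, the difference between the two patterns, as an illustrative
kernel-side sketch (not from the patch; assumes the usual linux/types.h
and asm/io.h definitions):

        /* Illustrative sketch only: the two write patterns being contrasted. */
        static void write_successive_ports(unsigned long port, const u8 *buf,
                                           size_t count)
        {
                size_t i;

                /* what /dev/port does today: one byte to each successive port */
                for (i = 0; i < count; i++)
                        outb(buf[i], port + i);
        }

        static void write_port_fifo(unsigned long port, const u8 *buf,
                                    size_t count)
        {
                /* what a device FIFO usually wants: every byte to the same port */
                outsb(port, buf, count);
        }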

Arnd
H. Peter Anvin
2014-05-09 21:21:12 UTC
Post by Arnd Bergmann
Post by H. Peter Anvin
However, if we're going to have these devices I'm wondering if having
/dev/portw and /dev/portl (or something like that) might not make sense,
rather than requiring a system call per transaction.
Actually the behavior of /dev/port for >1 byte writes seems questionable
already: There are very few devices on which writing to consecutive
port numbers makes sense. Normally you just want to write a series
of bytes (or 16/32 bit words) into the same port number instead,
as the outsb()/outsw()/outsl() functions do.
Indeed. I missed the detail that it increments the port index; it is
virtually guaranteed to be bogus.

-hpa


Josh Triplett
2014-05-09 22:38:54 UTC
Post by H. Peter Anvin
Post by Arnd Bergmann
Post by H. Peter Anvin
However, if we're going to have these devices I'm wondering if having
/dev/portw and /dev/portl (or something like that) might not make sense,
rather than requiring a system call per transaction.
Actually the behavior of /dev/port for >1 byte writes seems questionable
already: There are very few devices on which writing to consecutive
port numbers makes sense. Normally you just want to write a series
of bytes (or 16/32 bit words) into the same port number instead,
as the outsb()/outsw()/outsl() functions do.
Indeed. I missed the detail that it increments the port index; it is
virtually guaranteed to be bogus.
Exactly. It might make sense to have ioport8/ioport16/ioport32 devices
that accept arbitrary-length reads and writes (divisible by the size)
and do the equivalent of the string I/O instructions outs/ins, but for
the moment I'd like to add the single device that people always seem to
want and can't get from /dev/port. If someone's doing enough writes
that doing a syscall per in/out instruction seems like too much
overhead, they can write a real device driver or use ioperm/iopl.
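
Purely as a sketch of what such a hypothetical ioport32 write path might
look like (the name and details are invented here, not part of this patch):

        static ssize_t write_ioport32(struct file *file, const char __user *buf,
                                      size_t count, loff_t *ppos)
        {
                unsigned long port = *ppos;
                u32 *tmp;

                if (count == 0 || count % 4)
                        return -EINVAL;
                tmp = memdup_user(buf, count);
                if (IS_ERR(tmp))
                        return PTR_ERR(tmp);
                /* equivalent of rep outsl: repeated 32-bit writes to one port */
                outsl(port, tmp, count / 4);
                kfree(tmp);
                return count;
        }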

- Josh triplett
H. Peter Anvin
2014-05-13 22:11:39 UTC
Post by Josh Triplett
Post by H. Peter Anvin
Post by Arnd Bergmann
Post by H. Peter Anvin
However, if we're going to have these devices I'm wondering if having
/dev/portw and /dev/portl (or something like that) might not make sense,
rather than requiring a system call per transaction.
Actually the behavior of /dev/port for >1 byte writes seems questionable
already: There are very few devices on which writing to consecutive
port numbers makes sense. Normally you just want to write a series
of bytes (or 16/32 bit words) into the same port number instead,
as the outsb()/outsw()/outsl() functions do.
Indeed. I missed the detail that it increments the port index; it is
virtually guaranteed to be bogus.
Exactly. It might make sense to have ioport8/ioport16/ioport32 devices
that accept arbitrary-length reads and writes (divisible by the size)
and do the equivalent of the string I/O instructions outs/ins, but for
the moment I'd like to add the single device that people always seem to
want and can't get from /dev/port. If someone's doing enough writes
that doing a syscall per in/out instruction seems like too much
overhead, they can write a real device driver or use ioperm/iopl.
I really have a problem with the logic "our current interface is wrong,
so let's introduce another wrong interface which solves a narrow use
case". In some ways it would actually be *better* to use an ioctl
interface on /dev/port in that case...

-hpa



j***@joshtriplett.org
2014-05-15 21:56:59 UTC
Post by H. Peter Anvin
Post by Josh Triplett
Post by H. Peter Anvin
Post by Arnd Bergmann
Post by H. Peter Anvin
However, if we're going to have these devices I'm wondering if having
/dev/portw and /dev/portl (or something like that) might not make sense,
rather than requiring a system call per transaction.
Actually the behavior of /dev/port for >1 byte writes seems questionable
already: There are very few devices on which writing to consecutive
port numbers makes sense. Normally you just want to write a series
of bytes (or 16/32 bit words) into the same port number instead,
as the outsb()/outsw()/outsl() functions do.
Indeed. I missed the detail that it increments the port index; it is
virtually guaranteed to be bogus.
Exactly. It might make sense to have ioport8/ioport16/ioport32 devices
that accept arbitrary-length reads and writes (divisible by the size)
and do the equivalent of the string I/O instructions outs/ins, but for
the moment I'd like to add the single device that people always seem to
want and can't get from /dev/port. If someone's doing enough writes
that doing a syscall per in/out instruction seems like too much
overhead, they can write a real device driver or use ioperm/iopl.
I really have a problem with the logic "our current interface is wrong,
so let's introduce another wrong interface which solves a narrow use
case". In some ways it would actually be *better* to use an ioctl
interface on /dev/port in that case...
ioport{8,16,32} seems preferable to an ioctl on /dev/port, but in any
case, I'd be happy to adapt this patch to whatever interface seems
preferable. I just don't want to let the perfect be the enemy of the
good here; 16-bit and 32-bit port operations are currently completely
impossible via /dev/port, and I'm primarily interested in fixing that,
not necessarily in creating a completely generalized interface for doing
high-performance repeated I/O operations that ought to be in the kernel
anyway.

- Josh Triplett
Arnd Bergmann
2014-05-19 12:37:43 UTC
Post by j***@joshtriplett.org
Post by H. Peter Anvin
Post by Josh Triplett
Post by H. Peter Anvin
Post by Arnd Bergmann
Post by H. Peter Anvin
However, if we're going to have these devices I'm wondering if having
/dev/portw and /dev/portl (or something like that) might not make sense,
rather than requiring a system call per transaction.
Actually the behavior of /dev/port for >1 byte writes seems questionable
already: There are very few devices on which writing to consecutive
port numbers makes sense. Normally you just want to write a series
of bytes (or 16/32 bit words) into the same port number instead,
as the outsb()/outsw()/outsl() functions do.
Indeed. I missed the detail that it increments the port index; it is
virtually guaranteed to be bogus.
Exactly. It might make sense to have ioport8/ioport16/ioport32 devices
that accept arbitrary-length reads and writes (divisible by the size)
and do the equivalent of the string I/O instructions outs/ins, but for
the moment I'd like to add the single device that people always seem to
want and can't get from /dev/port. If someone's doing enough writes
that doing a syscall per in/out instruction seems like too much
overhead, they can write a real device driver or use ioperm/iopl.
I really have a problem with the logic "our current interface is wrong,
so let's introduce another wrong interface which solves a narrow use
case". In some ways it would actually be *better* to use an ioctl
interface on /dev/port in that case...
ioport{8,16,32} seems preferable to an ioctl on /dev/port, but in any
case, I'd be happy to adapt this patch to whatever interface seems
preferable. I just don't want to let the perfect be the enemy of the
good here; 16-bit and 32-bit port operations are currently completely
impossible via /dev/port, and I'm primarily interested in fixing that,
not necessarily in creating a completely generalized interface for doing
high-performance repeated I/O operations that ought to be in the kernel
anyway.
I'd prefer to do the absolute minimum to enable this: port I/O is not
something that people really should be doing in this decade, at least
not on non-x86.

A simple pair of ioctls would be fine in my mind, if we think there
is actually a strong use case. Josh, where did you find that testing
driver? Do you have reason to believe it's actually useful on non-x86?
If the reason for the existence of that code is just that someone
didn't find the iopl() man page, I'd rather not have it at all.

My feeling is that all devices we can think of fall into at least one
of these categories:

* legacy PC stuff that needs only byte access
* PCI devices that can be accessed through sysfs
* devices on x86 that can be accessed using iopl

Arnd
H. Peter Anvin
2014-05-28 21:42:21 UTC
Post by Arnd Bergmann
My feeling is that all devices we can think of fall into at least one
* legacy PC stuff that needs only byte access
* PCI devices that can be accessed through sysfs
* devices on x86 that can be accessed using iopl
I don't believe PCI I/O space devices can be accessed through sysfs, but
perhaps I'm wrong? (mmapping I/O space is not portable.)


-hpa


Arnd Bergmann
2014-05-29 09:26:56 UTC
Post by H. Peter Anvin
Post by Arnd Bergmann
My feeling is that all devices we can think of fall into at least one
* legacy PC stuff that needs only byte access
* PCI devices that can be accessed through sysfs
* devices on x86 that can be accessed using iopl
I don't believe PCI I/O space devices can be accessed through sysfs, but
perhaps I'm wrong? (mmapping I/O space is not portable.)
The interface is there, both a read/write and mmap on the resource
bin_attribute. But it seems you're right, neither of them is implemented
on all architectures.

Only powerpc, microblaze, alpha, sparc and xtensa allow users to mmap
I/O space, even though a lot of others could. The read-write interface
is only defined for alpha, ia64, microblaze and powerpc.

Arnd
H. Peter Anvin
2014-05-29 13:39:10 UTC
Post by Arnd Bergmann
Post by H. Peter Anvin
Post by Arnd Bergmann
My feeling is that all devices we can think of fall into at least one
* legacy PC stuff that needs only byte access
* PCI devices that can be accessed through sysfs
* devices on x86 that can be accessed using iopl
I don't believe PCI I/O space devices can be accessed through sysfs, but
perhaps I'm wrong? (mmapping I/O space is not portable.)
The interface is there, both a read/write and mmap on the resource
bin_attribute. But it seems you're right, neither of them is implemented
on all architectures.
Only powerpc, microblaze, alpha, sparc and xtensa allow users to mmap
I/O space, even though a lot of others could. The read-write interface
is only defined for alpha, ia64, microblaze and powerpc.
And how is that read/write interface defined? Does it have the same
silly handling of data sizes?

-hpa


Arnd Bergmann
2014-05-30 11:34:20 UTC
Post by H. Peter Anvin
Post by Arnd Bergmann
Post by H. Peter Anvin
Post by Arnd Bergmann
My feeling is that all devices we can think of fall into at least one
* legacy PC stuff that needs only byte access
* PCI devices that can be accessed through sysfs
* devices on x86 that can be accessed using iopl
I don't believe PCI I/O space devices can be accessed through sysfs, but
perhaps I'm wrong? (mmapping I/O space is not portable.)
The interface is there, both a read/write and mmap on the resource
bin_attribute. But it seems you're right, neither of them is implemented
on all architectures.
Only powerpc, microblaze, alpha, sparc and xtensa allow users to mmap
I/O space, even though a lot of others could. The read-write interface
is only defined for alpha, ia64, microblaze and powerpc.
And how is that read/write interface defined? Does it have the same
silly handling of data sizes?
In architecture specific code, e.g. for powerpc:

int pci_legacy_read(struct pci_bus *bus, loff_t port, u32 *val, size_t size)
{
	unsigned long offset;
	struct pci_controller *hose = pci_bus_to_host(bus);
	struct resource *rp = &hose->io_resource;
	void __iomem *addr;

	/* Check if port can be supported by that bus. We only check
	 * the ranges of the PHB though, not the bus itself as the rules
	 * for forwarding legacy cycles down bridges are not our problem
	 * here. So if the host bridge supports it, we do it.
	 */
	offset = (unsigned long)hose->io_base_virt - _IO_BASE;
	offset += port;

	if (!(rp->flags & IORESOURCE_IO))
		return -ENXIO;
	if (offset < rp->start || (offset + size) > rp->end)
		return -ENXIO;
	addr = hose->io_base_virt + port;

	switch(size) {
	case 1:
		*((u8 *)val) = in_8(addr);
		return 1;
	case 2:
		if (port & 1)
			return -EINVAL;
		*((u16 *)val) = in_le16(addr);
		return 2;
	case 4:
		if (port & 3)
			return -EINVAL;
		*((u32 *)val) = in_le32(addr);
		return 4;
	}
	return -EINVAL;
}

The common code already enforces size to be 1, 2 or 4.
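
From user space, on an architecture that implements this, the interface is
reachable through the legacy_io file in sysfs; a rough sketch, where the
bus path and the port number are only examples:

        #include <fcntl.h>
        #include <stdint.h>
        #include <unistd.h>

        /* 16-bit read of port 0x60 through the per-bus legacy_io file. */
        int read_port_word(uint16_t *val)
        {
                int fd = open("/sys/class/pci_bus/0000:00/legacy_io", O_RDONLY);

                if (fd < 0)
                        return -1;
                /* offset = port number, read size selects the access width */
                if (pread(fd, val, sizeof(*val), 0x60) != sizeof(*val)) {
                        close(fd);
                        return -1;
                }
                close(fd);
                return 0;
        }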

Arnd
Santosh Shukla
2015-12-22 10:52:37 UTC
Post by Arnd Bergmann
Post by H. Peter Anvin
Post by Arnd Bergmann
Post by H. Peter Anvin
Post by Arnd Bergmann
My feeling is that all devices we can think of fall into at least one
* legacy PC stuff that needs only byte access
* PCI devices that can be accessed through sysfs
* devices on x86 that can be accessed using iopl
I don't believe PCI I/O space devices can be accessed through sysfs, but
perhaps I'm wrong? (mmapping I/O space is not portable.)
The interface is there, both a read/write and mmap on the resource
bin_attribute. But it seems you're right, neither of them is implemented
on all architectures.
Only powerpc, microblaze, alpha, sparc and xtensa allow users to mmap
I/O space, even though a lot of others could. The read-write interface
is only defined for alpha, ia64, microblaze and powerpc.
And how is that read/write interface defined? Does it have the same
silly handling of data sizes?
int pci_legacy_read(struct pci_bus *bus, loff_t port, u32 *val, size_t size)
{
	unsigned long offset;
	struct pci_controller *hose = pci_bus_to_host(bus);
	struct resource *rp = &hose->io_resource;
	void __iomem *addr;
	/* Check if port can be supported by that bus. We only check
	 * the ranges of the PHB though, not the bus itself as the rules
	 * for forwarding legacy cycles down bridges are not our problem
	 * here. So if the host bridge supports it, we do it.
	 */
	offset = (unsigned long)hose->io_base_virt - _IO_BASE;
	offset += port;
	if (!(rp->flags & IORESOURCE_IO))
		return -ENXIO;
	if (offset < rp->start || (offset + size) > rp->end)
		return -ENXIO;
	addr = hose->io_base_virt + port;
	switch(size) {
	case 1:
		*((u8 *)val) = in_8(addr);
		return 1;
	case 2:
		if (port & 1)
			return -EINVAL;
		*((u16 *)val) = in_le16(addr);
		return 2;
	case 4:
		if (port & 3)
			return -EINVAL;
		*((u32 *)val) = in_le32(addr);
		return 4;
	}
	return -EINVAL;
}
The common code already enforces size to be 1, 2 or 4.
I have a use case for both arm and arm64 where a user-space application
accesses the PCI I/O space. The use case: dpdk's virtio-pmd user-space
driver running inside a VM/guest. The virtio-pmd driver maps the PCI I/O
region into guest user space and does its driver initialization there. On
x86 the pmd driver uses iopl() to access the ports via the in/out port
APIs, in{b,w,l}()/out{b,w,l}(). The problem is on platforms like arm,
where the kernel does not map PCI I/O space:

file: arch/arm/kernel/bios32.c
int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
			enum pci_mmap_state mmap_state, int write_combine)
{
	if (mmap_state == pci_mmap_io)
		return -EINVAL;
	....
}

So I care about a /dev/ioports-style interface that can do more than
byte-sized copies to/from user space. I tested this patch with a small
modification and was able to run the pmd driver on arm/arm64.

I'd like to know how to address the PCI I/O region mapping problem for
arm/arm64 in case the /dev/ioports approach is not acceptable; otherwise I
can spend time restructuring the patch.

Use-case details [1].

Thanks in advance.

[1] http://dpdk.org/ml/archives/dev/2015-December/030530.html
Post by Arnd Bergmann
Arnd
Arnd Bergmann
2015-12-22 21:57:52 UTC
Post by Santosh Shukla
}
So I care for /dev/ioport types interface who could do more than byte
data copy to/from user-space. I tested this patch with little
modification and could able to run pmd driver for arm/arm64 case.
Like to know how to address pci_io region mapping problem for
arm/arm64, in-case /dev/ioports approach is not acceptable or else I
can spent time on restructuring the patch?
For the use case you describe, can't you use the vfio framework to
access the PCI BARs?

After all, you are talking about regular PCI devices, not access to
random unknown I/O port numbers.

Arnd
H. Peter Anvin
2015-12-22 22:03:11 UTC
Post by Arnd Bergmann
Post by Santosh Shukla
}
So I care for /dev/ioport types interface who could do more than byte
data copy to/from user-space. I tested this patch with little
modification and could able to run pmd driver for arm/arm64 case.
Like to know how to address pci_io region mapping problem for
arm/arm64, in-case /dev/ioports approach is not acceptable or else I
can spent time on restructuring the patch?
For the use case you describe, can't you use the vfio framework to
access the PCI BARs?
After all, you are talking about regular PCI devices, not access to
random unknown I/O port numbers.
Arnd
On that subject, shouldn't we have common infrastructure to deal with memory mapped I/O ports in the kernel? Or do we have that now? I obviously don't pay too much attention...
--
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.
Arnd Bergmann
2015-12-22 22:12:54 UTC
Post by H. Peter Anvin
On that subject, shouldn't we have common infrastructure to deal with memory
mapped I/O ports in the kernel? Or do we have that now? I obviously don't
pay too much attention...
We don't have it at the moment, though some of the code that we introduced
for arm64 is defined in common code, just not shared with anything else.

Changing other architectures over to use this is painful and gains the
architectures very little, so I doubt it is going to happen.

Arnd
Santosh Shukla
2015-12-23 11:34:48 UTC
Post by Arnd Bergmann
Post by Santosh Shukla
}
So I care for /dev/ioport types interface who could do more than byte
data copy to/from user-space. I tested this patch with little
modification and could able to run pmd driver for arm/arm64 case.
Like to know how to address pci_io region mapping problem for
arm/arm64, in-case /dev/ioports approach is not acceptable or else I
can spent time on restructuring the patch?
For the use case you describe, can't you use the vfio framework to
access the PCI BARs?
I looked at drivers/vfio/pci/vfio_pci.c, in particular vfio_pci_mmap(),
and it looks to me like it only maps IORESOURCE_MEM PCI regions; pasting a
code snippet:

	if (!(pci_resource_flags(pdev, index) & IORESOURCE_MEM))
		return -EINVAL;
...

whereas I want to map the IORESOURCE_IO PCI region on the arm platform for
my use case. I'm not sure vfio can map a PCI I/O BAR region?
Post by Arnd Bergmann
After all, you are talking about regular PCI devices, not access to
random unknown I/O port numbers.
Yes, pci_iobar region.
Post by Arnd Bergmann
Arnd
Arnd Bergmann
2015-12-29 13:29:23 UTC
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
}
So I care for /dev/ioport types interface who could do more than byte
data copy to/from user-space. I tested this patch with little
modification and could able to run pmd driver for arm/arm64 case.
Like to know how to address pci_io region mapping problem for
arm/arm64, in-case /dev/ioports approach is not acceptable or else I
can spent time on restructuring the patch?
For the use case you describe, can't you use the vfio framework to
access the PCI BARs?
I looked at file: drivers/vfio/pci/vfio_pci.c, func vfio_pci_map() and
it look to me that it only maps ioresource_mem pci region, pasting
if (!(pci_resource_flags(pdev, index) & IORESOURCE_MEM))
return -EINVAL;
....
and I want to map ioresource_io pci region for arm platform in my
use-case. Not sure vfio maps pci_iobar region?
Mapping I/O BARs is not portable, notably it doesn't work on x86.

You should be able to access them using the read/write interface on
the vfio device.
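
A rough sketch of what that looks like from user space, assuming
vfio_dev_fd is an already-opened vfio device fd, BAR0 is the I/O port BAR,
and the register offset is only an example:

        #include <stdint.h>
        #include <sys/ioctl.h>
        #include <unistd.h>
        #include <linux/vfio.h>

        static int vfio_io_read32(int vfio_dev_fd, uint32_t *val)
        {
                struct vfio_region_info reg = {
                        .argsz = sizeof(reg),
                        .index = VFIO_PCI_BAR0_REGION_INDEX,
                };

                if (ioctl(vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO, &reg) < 0)
                        return -1;
                /* the device fd's read/write accessors do the in/out for us;
                 * 0x10 here is just an example register within the BAR */
                if (pread(vfio_dev_fd, val, sizeof(*val),
                          reg.offset + 0x10) != sizeof(*val))
                        return -1;
                return 0;
        }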

Arnd
Santosh Shukla
2015-12-29 15:54:12 UTC
Post by Arnd Bergmann
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
}
So I care for /dev/ioport types interface who could do more than byte
data copy to/from user-space. I tested this patch with little
modification and could able to run pmd driver for arm/arm64 case.
Like to know how to address pci_io region mapping problem for
arm/arm64, in-case /dev/ioports approach is not acceptable or else I
can spent time on restructuring the patch?
For the use case you describe, can't you use the vfio framework to
access the PCI BARs?
I looked at file: drivers/vfio/pci/vfio_pci.c, func vfio_pci_map() and
it look to me that it only maps ioresource_mem pci region, pasting
if (!(pci_resource_flags(pdev, index) & IORESOURCE_MEM))
return -EINVAL;
....
and I want to map ioresource_io pci region for arm platform in my
use-case. Not sure vfio maps pci_iobar region?
Mapping I/O BARs is not portable, notably it doesn't work on x86.
You should be able access them using the read/write interface on
the vfio device.
Right, x86 doesn't care, since iopl() can give a userspace application
direct access to I/O ports.

Also, Alex suggested in another dpdk thread [1] that someone propose I/O
BAR mapping in vfio-pci, I guess particularly for non-x86 architectures,
so I started working on it.

Thanks.

[1] http://dpdk.org/ml/archives/dev/2015-December/030852.html
Post by Arnd Bergmann
Arnd
Santosh Shukla
2015-12-29 15:55:23 UTC
I mistakenly added the wrong email address for Alex; looping in the correct one.
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
}
So I care for /dev/ioport types interface who could do more than byte
data copy to/from user-space. I tested this patch with little
modification and could able to run pmd driver for arm/arm64 case.
Like to know how to address pci_io region mapping problem for
arm/arm64, in-case /dev/ioports approach is not acceptable or else I
can spent time on restructuring the patch?
For the use case you describe, can't you use the vfio framework to
access the PCI BARs?
I looked at file: drivers/vfio/pci/vfio_pci.c, func vfio_pci_map() and
it look to me that it only maps ioresource_mem pci region, pasting
if (!(pci_resource_flags(pdev, index) & IORESOURCE_MEM))
return -EINVAL;
....
and I want to map ioresource_io pci region for arm platform in my
use-case. Not sure vfio maps pci_iobar region?
Mapping I/O BARs is not portable, notably it doesn't work on x86.
You should be able access them using the read/write interface on
the vfio device.
Right, x86 doesn't care as iopl() could give userspace application
direct access to ioports.
Also, Alex in other dpdk thread [1] suggested someone to propose io
bar mapping in vfio-pci, I guess in particular to non-x86 arch so I
started working on it.
Thanks.
[1] http://dpdk.org/ml/archives/dev/2015-December/030852.html
Post by Arnd Bergmann
Arnd
Arnd Bergmann
2015-12-29 16:21:35 UTC
Post by Santosh Shukla
mistakenly added wrong email-id of alex, looping his correct one.
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
}
So I care for /dev/ioport types interface who could do more than byte
data copy to/from user-space. I tested this patch with little
modification and could able to run pmd driver for arm/arm64 case.
Like to know how to address pci_io region mapping problem for
arm/arm64, in-case /dev/ioports approach is not acceptable or else I
can spent time on restructuring the patch?
For the use case you describe, can't you use the vfio framework to
access the PCI BARs?
I looked at file: drivers/vfio/pci/vfio_pci.c, func vfio_pci_map() and
it look to me that it only maps ioresource_mem pci region, pasting
if (!(pci_resource_flags(pdev, index) & IORESOURCE_MEM))
return -EINVAL;
....
and I want to map ioresource_io pci region for arm platform in my
use-case. Not sure vfio maps pci_iobar region?
Mapping I/O BARs is not portable, notably it doesn't work on x86.
You should be able access them using the read/write interface on
the vfio device.
Right, x86 doesn't care as iopl() could give userspace application
direct access to ioports.
Also, Alex in other dpdk thread [1] suggested someone to propose io
bar mapping in vfio-pci, I guess in particular to non-x86 arch so I
started working on it.
So what's wrong with just using the existing read/write API on all
architectures?

Arnd
Santosh Shukla
2015-12-29 16:30:53 UTC
Post by Arnd Bergmann
Post by Santosh Shukla
mistakenly added wrong email-id of alex, looping his correct one.
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
}
So I care for /dev/ioport types interface who could do more than byte
data copy to/from user-space. I tested this patch with little
modification and could able to run pmd driver for arm/arm64 case.
Like to know how to address pci_io region mapping problem for
arm/arm64, in-case /dev/ioports approach is not acceptable or else I
can spent time on restructuring the patch?
For the use case you describe, can't you use the vfio framework to
access the PCI BARs?
I looked at file: drivers/vfio/pci/vfio_pci.c, func vfio_pci_map() and
it look to me that it only maps ioresource_mem pci region, pasting
if (!(pci_resource_flags(pdev, index) & IORESOURCE_MEM))
return -EINVAL;
....
and I want to map ioresource_io pci region for arm platform in my
use-case. Not sure vfio maps pci_iobar region?
Mapping I/O BARs is not portable, notably it doesn't work on x86.
You should be able access them using the read/write interface on
the vfio device.
Right, x86 doesn't care as iopl() could give userspace application
direct access to ioports.
Also, Alex in other dpdk thread [1] suggested someone to propose io
bar mapping in vfio-pci, I guess in particular to non-x86 arch so I
started working on it.
So what's wrong with just using the existing read/write API on all
architectures?
Nothing wrong; in fact the read/write API would still be used to access
the mmapped I/O PCI BAR from userspace. But right now vfio_pci_mmap()
doesn't map the I/O PCI BAR in particular (i.e. IORESOURCE_IO), so I guess
that BAR mapping needs to be added to vfio. Please correct me if I've
misunderstood anything.
Post by Arnd Bergmann
Arnd
Alex Williamson
2015-12-29 17:31:24 UTC
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
mistakenly added wrong email-id of alex, looping his correct one.
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
}
So I care for /dev/ioport types interface who could do
more than byte
data copy to/from user-space. I tested this patch with little
modification and could able to run pmd driver for
arm/arm64 case.
Like to know how to address pci_io region mapping problem for
arm/arm64, in-case /dev/ioports approach is not
acceptable or else I
can spent time on restructuring the patch?
For the use case you describe, can't you use the vfio framework to
access the PCI BARs?
I looked at file: drivers/vfio/pci/vfio_pci.c, func
vfio_pci_map() and
it look to me that it only maps ioresource_mem pci region, pasting
if (!(pci_resource_flags(pdev, index) & IORESOURCE_MEM))
return -EINVAL;
....
and I want to map ioresource_io pci region for arm platform in my
use-case. Not sure vfio maps pci_iobar region?
Mapping I/O BARs is not portable, notably it doesn't work on x86.
You should be able access them using the read/write interface on
the vfio device.
Right, x86 doesn't care as iopl() could give userspace
application
direct access to ioports.
Also, Alex in other dpdk thread [1] suggested someone to
propose io
bar mapping in vfio-pci, I guess in particular to non-x86 arch so I
started working on it.
So what's wrong with just using the existing read/write API on all
architectures?
nothing wrong, infact read/write api will still be used so to access
mmaped io pci bar at userspace. But right now vfio_pci_map() doesn't
vfio_pci_mmap(), the read/write accessors fully support i/o port.
Post by Santosh Shukla
map io pci bar in particular (i.e.. ioresource_io) so I guess need to
add that bar mapping in vfio. pl. correct me if i misunderstood
anything.
Maybe I misunderstood what you were asking for, it seemed like you
specifically wanted to be able to mmap i/o port space, which is
possible, just not something we can do on x86.  Maybe I should have
asked why.  The vfio API already supports read/write access to i/o port
space, so if you intend to mmap it only to use read/write on top of the
mmap, I suppose you might see some performance improvement, but not
really any new functionality.  You'd also need to deal with page size
issues since i/o port ranges are generally quite a bit smaller than the
host page size and they'd need to be mapped such that each devices does
not share a host page of i/o port space with other devices.  On x86 i/o
port space is mostly considered legacy and not a performance critical
path for most modern devices; PCI SR-IOV specifically excludes i/o port
space.  So what performance gains do you expect to see in being able to
mmap i/o port space and what hardware are you dealing with that relies
on i/o port space rather than mmio for performance?  Thanks,

Alex
Santosh Shukla
2015-12-31 09:34:12 UTC
On Tue, Dec 29, 2015 at 11:01 PM, Alex Williamson
Post by Alex Williamson
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
mistakenly added wrong email-id of alex, looping his correct one.
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
}
So I care for /dev/ioport types interface who could do
more than byte
data copy to/from user-space. I tested this patch with little
modification and could able to run pmd driver for
arm/arm64 case.
Like to know how to address pci_io region mapping problem for
arm/arm64, in-case /dev/ioports approach is not
acceptable or else I
can spent time on restructuring the patch?
For the use case you describe, can't you use the vfio
framework to
access the PCI BARs?
I looked at file: drivers/vfio/pci/vfio_pci.c, func
vfio_pci_map() and
it look to me that it only maps ioresource_mem pci region, pasting
if (!(pci_resource_flags(pdev, index) & IORESOURCE_MEM))
return -EINVAL;
....
and I want to map ioresource_io pci region for arm platform in my
use-case. Not sure vfio maps pci_iobar region?
Mapping I/O BARs is not portable, notably it doesn't work on x86.
You should be able access them using the read/write interface on
the vfio device.
Right, x86 doesn't care as iopl() could give userspace
application
direct access to ioports.
Also, Alex in other dpdk thread [1] suggested someone to propose io
bar mapping in vfio-pci, I guess in particular to non-x86 arch so I
started working on it.
So what's wrong with just using the existing read/write API on all
architectures?
nothing wrong, infact read/write api will still be used so to access
mmaped io pci bar at userspace. But right now vfio_pci_map() doesn't
vfio_pci_mmap(), the read/write accessors fully support i/o port.
(Sorry for the delayed response!)
Right.
Post by Alex Williamson
Post by Santosh Shukla
map io pci bar in particular (i.e.. ioresource_io) so I guess need to
add that bar mapping in vfio. pl. correct me if i misunderstood
anything.
Maybe I misunderstood what you were asking for, it seemed like you
specifically wanted to be able to mmap i/o port space, which is
possible, just not something we can do on x86. Maybe I should have
asked why. The vfio API already supports read/write access to i/o port
Yes, I want to map PCI I/O port space in vfio. The reason is that I want
to access a virtio-net-pci device from userspace using vfio, and for that
I am using the latest vfio-noiommu linux-next patch, but I am not able to
mmap the PCI I/O port space in vfio because of the conditions below.

1)
--- user space code snippet ----
reg.index = i; // where i is {0..1}, i.e. {BAR0..BAR1}, such that BAR0 is
               // the PCI I/O port space and BAR1 is the PCI config space

ret = ioctl(vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO, &reg);
if ((reg.flags & VFIO_REGION_INFO_FLAG_MMAP) == 0) {
	return err;
}
Now consider the i = 0 case, where pci_resource_flags has IORESOURCE_IO set.

--- kernel / vfio_pci.c -------------
vfio_pci_ioctl() won't set info.flags to VFIO_REGION_INFO_FLAG_MMAP, for
two reasons:
1) it requires pci_resource_flags & IORESOURCE_MEM
2) the I/O port region size is smaller than PAGE_SIZE

The second one I have addressed, but the first one is where I believe
support needs to be added to vfio, and the same applies to vfio_pci_mmap()
too.

This is why I am thinking of adding IORESOURCE_IO space mapping support to
vfio, in particular for non-x86 architectures. Please correct my
understanding if it's wrong.
Post by Alex Williamson
space, so if you intend to mmap it only to use read/write on top of the
mmap, I suppose you might see some performance improvement, but not
really any new functionality. You'd also need to deal with page size
issues since i/o port ranges are generally quite a bit smaller than the
host page size and they'd need to be mapped such that each devices does
not share a host page of i/o port space with other devices. On x86 i/o
Yes, I have taken care of the size < PAGE_SIZE condition.
Post by Alex Williamson
port space is mostly considered legacy and not a performance critical
path for most modern devices; PCI SR-IOV specifically excludes i/o port
space. So what performance gains do you expect to see in being able to
mmap i/o port space and what hardware are you dealing with that relies
on i/o port space rather than mmio for performance? Thanks,
The dpdk user-space virtio-net pmd driver uses I/O port space for driver
initialization, because the virtio-net header resides in the I/O port area
(as with virtio-pxe.rom), in line with the virtio spec (<= 0.95). Until
now the virtio-net dpdk pmd driver on x86 has used iopl() to access those
I/O ports for driver initialization, but for the non-x86 cases we need an
alternative: either the kernel maps the PCI I/O port region somehow, as
powerpc for example does, or we look to vfio for the mapping. I hope I
have made my use case clear.
Post by Alex Williamson
Alex
Alex Williamson
2015-12-31 15:42:54 UTC
Post by Santosh Shukla
On Tue, Dec 29, 2015 at 11:01 PM, Alex Williamson
Post by Alex Williamson
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
mistakenly added wrong email-id of alex, looping his correct one.
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
}
So I care for /dev/ioport types interface who could do
more than byte
data copy to/from user-space. I tested this patch
with
little
modification and could able to run pmd driver for
arm/arm64 case.
Like to know how to address pci_io region mapping
problem for
arm/arm64, in-case /dev/ioports approach is not
acceptable or else I
can spent time on restructuring the patch?
For the use case you describe, can't you use the vfio
framework to
access the PCI BARs?
I looked at file: drivers/vfio/pci/vfio_pci.c, func
vfio_pci_map() and
it look to me that it only maps ioresource_mem pci
region,
pasting
if (!(pci_resource_flags(pdev, index) &
IORESOURCE_MEM))
return -EINVAL;
....
and I want to map ioresource_io pci region for arm
platform
in my
use-case. Not sure vfio maps pci_iobar region?
Mapping I/O BARs is not portable, notably it doesn't work
on
x86.
You should be able access them using the read/write
interface
on
the vfio device.
Right, x86 doesn't care as iopl() could give userspace application
direct access to ioports.
Also, Alex in other dpdk thread [1] suggested someone to propose io
bar mapping in vfio-pci, I guess in particular to non-x86
arch
so I
started working on it.
So what's wrong with just using the existing read/write API on all
architectures?
nothing wrong, infact read/write api will still be used so to access
mmaped io pci bar at userspace. But right now vfio_pci_map() doesn't
vfio_pci_mmap(), the read/write accessors fully support i/o port.
(Sorry for delayed response!)
Right.
Post by Alex Williamson
Post by Santosh Shukla
map io pci bar in particular (i.e.. ioresource_io) so I guess need to
add that bar mapping in vfio. pl. correct me if i misunderstood
anything.
Maybe I misunderstood what you were asking for, it seemed like you
specifically wanted to be able to mmap i/o port space, which is
possible, just not something we can do on x86.  Maybe I should have
asked why.  The vfio API already supports read/write access to i/o
port
I want to access virto-net-pci device at userspace using vfio and for
that I am using vfio-noiommu latest linux-next patch. but I am not
able to mmap io port pci space in vfio because of below condition -
1)
--- user space code snippet ----
reg.index = i; // where i is {0..1} i.e.. {BAR0..BAR1} such that BAR0
= io port pci space and BAR1 = pci config space
ret = ioctl(vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO, &reg);
if ((reg.flags & VFIO_REGION_INFO_FLAG_MMAP) == 0) {
       return err;
}
now consider i = 0 case where pci_rersource_flag set to IORESOURCE_IO
--- kernel / vfip-pci.c -------------
so vfio_pci_ioctl() wont set info.flag to VFIO_REGION_INFO_FLAG_MMAP.
And it won't set for two
1) pci_resource_flag & IORESOURCE_MEM
2) ioport size < PAZE_SIZE
The second one I addressed but first one is what I believe that need
to add support in vfio.
and Same applicable for vfio_pci_mmap() too..
This is why I am thinking to add IORESOURCE_IO space mapping support
in vfio; in particular non-x86 archs.. pl. correct my understanding in
case wrong.
Post by Alex Williamson
space, so if you intend to mmap it only to use read/write on top of the
mmap, I suppose you might see some performance improvement, but not
really any new functionality.  You'd also need to deal with page
size
issues since i/o port ranges are generally quite a bit smaller than the
host page size and they'd need to be mapped such that each devices does
not share a host page of i/o port space with other devices.  On x86
i/o
Yes. I have taken care size < PAZE_SIZE condition.
Post by Alex Williamson
port space is mostly considered legacy and not a performance
critical
path for most modern devices; PCI SR-IOV specifically excludes i/o port
space.  So what performance gains do you expect to see in being
able to
mmap i/o port space and what hardware are you dealing with that relies
on i/o port space rather than mmio for performance?  Thanks,
The dpdk user space virtio-net pmd driver uses ioport space for driver
initialization, because the virtio-net header resides in the ioport area
of the virtio-pxe.rom file, in line with the virtio spec (<= 0.95). Until
now the virtio-net dpdk pmd driver for x86 has used iopl() to access those
ioports for driver initialization, but for non-x86 cases we needed an
alternative, i.e. some way for the kernel to map the ioport pci region,
either per architecture (as powerpc does) or via vfio mapping. I hope
I made my use-case clear.
Not really.  I still don't understand why you need to *mmap* ioport
space rather than access it via read/write.  vfio already supports
assignment of numerous physical devices that rely on ioport space for
the device rom, device initialization, and even runtime operation in
QEMU using the accesses currently supported.  Legacy x86 ioport space
cannot be mmap'd on x86 hardware, it's only through sparse memory
mapping and emulation of ioport space provided on some architectures
that this is even possible, so you will not achieve
platform/architecture neutral support for mmap'ing ioport space, which
means that your userspace driver will not work universally if it
depends on this support.

If you were using iopl() and in*()/out*() before, simply drop the
iopl() and use pread()/pwrite() instead.  Thanks,

Alex 
Santosh Shukla
2016-01-07 09:31:45 UTC
Permalink
On Thu, Dec 31, 2015 at 9:11 PM, Alex Williamson
Post by Santosh Shukla
On Tue, Dec 29, 2015 at 11:01 PM, Alex Williamson
Post by Alex Williamson
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
mistakenly added the wrong email-id for Alex; looping in his correct one.
lina
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
Post by Arnd Bergmann
Post by Santosh Shukla
}
So I care about a /dev/ioports-type interface that can do more than
byte-sized data copies to/from user-space. I tested this patch with a
little modification and was able to run the pmd driver for the
arm/arm64 case. I'd like to know how to address the pci_io region
mapping problem for arm/arm64, in case the /dev/ioports approach is not
acceptable, or else I can spend time on restructuring the patch.
For the use case you describe, can't you use the vfio framework to
access the PCI BARs?
I looked at the file drivers/vfio/pci/vfio_pci.c, func vfio_pci_map(), and
it looks to me that it only maps the ioresource_mem pci region; pasting:
if (!(pci_resource_flags(pdev, index) & IORESOURCE_MEM))
        return -EINVAL;
....
and I want to map the ioresource_io pci region for the arm platform in my
use-case. Not sure vfio maps the pci io bar region?
Mapping I/O BARs is not portable; notably it doesn't work on x86.
You should be able to access them using the read/write interface on
the vfio device.
Right, x86 doesn't care, as iopl() can give a userspace application
direct access to ioports.
Also, Alex in another dpdk thread [1] suggested that someone propose io
bar mapping in vfio-pci, I guess in particular for non-x86 archs, so I
started working on it.
So what's wrong with just using the existing read/write API on all
architectures?
Nothing wrong; in fact the read/write api will still be used to access
the mmapped io pci bar at userspace. But right now vfio_pci_map() doesn't
vfio_pci_mmap(); the read/write accessors fully support i/o port.
(Sorry for the delayed response!)
Right.
Post by Alex Williamson
Post by Santosh Shukla
map the io pci bar in particular (i.e. ioresource_io), so I guess we need
to add that bar mapping in vfio. Please correct me if I misunderstood
anything.
Maybe I misunderstood what you were asking for; it seemed like you
specifically wanted to be able to mmap i/o port space, which is
possible, just not something we can do on x86. Maybe I should have
asked why. The vfio API already supports read/write access to i/o port
I want to access a virtio-net-pci device from userspace using vfio, and
for that I am using the latest vfio-noiommu linux-next patch, but I am
not able to mmap io port pci space in vfio because of the condition below -
1)
--- user space code snippet ----
/* i ranges over {0..1}, i.e. {BAR0..BAR1}, where BAR0 = io port pci
 * space and BAR1 = pci config space */
reg.index = i;
ret = ioctl(vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO, &reg);
if ((reg.flags & VFIO_REGION_INFO_FLAG_MMAP) == 0) {
        return err;
}
Now consider the i = 0 case, where pci_resource_flags() is set to IORESOURCE_IO.
--- kernel / vfio_pci.c -------------
vfio_pci_ioctl() won't set info.flags to VFIO_REGION_INFO_FLAG_MMAP,
and it won't set it for two reasons:
1) the check requires pci_resource_flags() & IORESOURCE_MEM
2) the ioport size is < PAGE_SIZE
The second one I have addressed, but the first one is where I believe
support needs to be added in vfio.
And the same is applicable for vfio_pci_mmap() too.
This is why I am thinking of adding IORESOURCE_IO space mapping support
in vfio, in particular for non-x86 archs. Please correct my understanding
if it is wrong.
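(For reference, a minimal userspace sketch of the check described above -- not code from this thread, and assuming vfio_dev_fd was obtained via VFIO_GROUP_GET_DEVICE_FD:)

#include <linux/vfio.h>
#include <string.h>
#include <sys/ioctl.h>

/* Sketch: query BAR0's region info and report whether vfio will allow
 * mmap on it.  I/O port BARs (and BARs smaller than a page) come back
 * without VFIO_REGION_INFO_FLAG_MMAP and must be accessed with
 * pread()/pwrite() at reg->offset instead. */
static int bar0_is_mmapable(int vfio_dev_fd, struct vfio_region_info *reg)
{
        memset(reg, 0, sizeof(*reg));
        reg->argsz = sizeof(*reg);
        reg->index = VFIO_PCI_BAR0_REGION_INDEX;  /* BAR0 = the i/o port BAR here */

        if (ioctl(vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO, reg) < 0)
                return -1;

        return !!(reg->flags & VFIO_REGION_INFO_FLAG_MMAP);
}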
Post by Alex Williamson
space, so if you intend to mmap it only to use read/write on top of the
mmap, I suppose you might see some performance improvement, but not
really any new functionality. You'd also need to deal with page size
issues since i/o port ranges are generally quite a bit smaller than the
host page size and they'd need to be mapped such that each device does
not share a host page of i/o port space with other devices. On x86 i/o
Yes, I have taken care of the size < PAGE_SIZE condition.
Post by Alex Williamson
port space is mostly considered legacy and not a performance critical
path for most modern devices; PCI SR-IOV specifically excludes i/o port
space. So what performance gains do you expect to see in being able to
mmap i/o port space and what hardware are you dealing with that relies
on i/o port space rather than mmio for performance? Thanks,
The dpdk user space virtio-net pmd driver uses ioport space for driver
initialization, because the virtio-net header resides in the ioport area
of the virtio-pxe.rom file, in line with the virtio spec (<= 0.95). Until
now the virtio-net dpdk pmd driver for x86 has used iopl() to access those
ioports for driver initialization, but for non-x86 cases we needed an
alternative, i.e. some way for the kernel to map the ioport pci region,
either per architecture (as powerpc does) or via vfio mapping. I hope
I made my use-case clear.
Not really. I still don't understand why you need to *mmap* ioport
space rather than access it via read/write. vfio already supports
assignment of numerous physical devices that rely on ioport space for
the device rom, device initialization, and even runtime operation in
QEMU using the accesses currently supported. Legacy x86 ioport space
cannot be mmap'd on x86 hardware, it's only through sparse memory
mapping and emulation of ioport space provided on some architectures
that this is even possible, so you will not achieve
platform/architecture neutral support for mmap'ing ioport space, which
means that your userspace driver will not work universally if it
depends on this support.
If you were using iopl() and in*()/out*() before, simply drop the
iopl() and use pread()/pwrite() instead. Thanks,
Yes, it works. I got confused; sorry for the noise, and thanks for helping with this!
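(For reference, a minimal sketch of that replacement -- not code from this thread; bar_offset is the region offset reported by VFIO_DEVICE_GET_REGION_INFO for the i/o port BAR, and reg_off is a hypothetical register offset within it:)

#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch: instead of iopl() + inl()/outl(), read-modify-write a 32-bit
 * ioport register through the vfio device fd. */
static int rmw_ioport_reg(int vfio_dev_fd, off_t bar_offset, off_t reg_off)
{
        uint32_t val;

        if (pread(vfio_dev_fd, &val, sizeof(val), bar_offset + reg_off)
            != sizeof(val))
                return -1;

        val |= 0x1;     /* example modification only */

        if (pwrite(vfio_dev_fd, &val, sizeof(val), bar_offset + reg_off)
            != sizeof(val))
                return -1;

        return 0;
}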
Alex

Greg Kroah-Hartman
2014-05-10 17:18:54 UTC
Permalink
Post by Josh Triplett
@@ -827,6 +898,9 @@ static const struct memdev {
#ifdef CONFIG_PRINTK
[11] = { "kmsg", 0644, &kmsg_fops, NULL },
#endif
+#ifdef CONFIG_DEVPORT
+ [12] = { "ioports", 0, &ioports_fops, NULL },
Odd extra space?

Josh Triplett
2014-05-10 19:36:53 UTC
Permalink
Post by Greg Kroah-Hartman
Post by Josh Triplett
@@ -827,6 +898,9 @@ static const struct memdev {
#ifdef CONFIG_PRINTK
[11] = { "kmsg", 0644, &kmsg_fops, NULL },
#endif
+#ifdef CONFIG_DEVPORT
+ [12] = { "ioports", 0, &ioports_fops, NULL },
Odd extra space?
Copy/paste from the "port" device, which apparently has that space to
make "[4]" and "11" line up at the ones digit. Will fix.

- Josh Triplett
Josh Triplett
2014-05-10 19:33:07 UTC
Permalink
Post by Josh Triplett
+ if (port > 65535)
+ return 0;
+ switch (count) {
[...]
Post by Josh Triplett
+ if (__put_user(inl(port), buf) < 0)
+ return -EFAULT;
What if I attempt a four-byte read at 65535? That would access four
out-of-bounds bytes, right?
No, it would do an ind instruction on port 65535.

- Josh Triplett
Josh Triplett
2014-05-11 21:05:38 UTC
Permalink
Post by Josh Triplett
Post by Josh Triplett
+ if (port > 65535)
+ return 0;
+ switch (count) {
[...]
Post by Josh Triplett
+ if (__put_user(inl(port), buf) < 0)
+ return -EFAULT;
What if I attempt a four-byte read at 65535? That would access four
out-of-bounds bytes, right?
No, it would do an ind instruction on port 65535.
Yes, on x86. What about other architectures?
That's a good point; on architectures that map I/O to memory, this
device should check port+count rather than port. Is there a reliable
#define that identifies architectures with that property, other than
CONFIG_X86?
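For what it's worth, the adjusted check might look something like this (a sketch only, not part of the posted patch):

        /* Sketch: bound the whole access, not just its first byte, so a
         * 2- or 4-byte transfer can never run past port 65535. */
        if (port > 65535 || count > 65536 - port)
                return 0;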

- Josh Triplett
Maciej W. Rozycki
2014-06-01 10:35:27 UTC
Permalink
Post by Josh Triplett
Post by Josh Triplett
What if I attempt a four-byte read at 65535? That would access four
out-of-bounds bytes, right?
No, it would do an ind instruction on port 65535.
Yes, on x86. What about other architectures?
That's a good point; on architectures that map I/O to memory, this
device should check port+count rather than port. Is there a reliable
#define that identifies architectures with that property, other than
CONFIG_X86?
FWIW, on x86 an IND from 65535 will be split into two separate bus read
cycles, e.g. on a 32-bit data bus one at 0xfffc with only the MSB enabled
and another one with the three LSB enabled (64-bit buses will address the
first cycle at 0xfff8, etc.). Data obtained from these cycles is then
merged appropriately before writing to the destination. The address put
on the bus for the second cycle is implementation-specific as are all
accesses beyond 0xffff so it might be 0x10000 or it might be 0.

If implementing unaligned port I/O on non-x86 targets I think it would be
most reasonable to wrap the out-of-range part around back to 0 before
merging data obtained this way. The range supported can of course be
different to what x86 supports and may be specific e.g. to the PCI host
bridge (its port I/O space forwarding window size). Systems with peer
bridges may have multiple port I/O spaces too; this is one reason to have
the size of the port I/O space extended beyond 64kB; address bits from #16
up can then be used to select the intended host bridge to forward the
access to. Also IIRC PCI-PCI bridges only forward port I/O space accesses
within the low 64kB.

The easiest option is for unaligned port I/O not to be supported at all by
our kernel device, even on x86. I think that would actually be reasonable
to do; I have yet to hear of a piece of hardware that has any use for
unaligned port I/O.
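If unaligned port I/O were simply rejected, the check could be as small as this (a sketch under that assumption, not something the posted patch currently does):

        /* Sketch: with count restricted to 1, 2 or 4, requiring the port
         * to be naturally aligned to the access size rules out unaligned
         * port I/O entirely. */
        if (port & (count - 1))
                return -EINVAL;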

Maciej
H. Peter Anvin
2014-06-04 22:59:50 UTC
Permalink
Post by Maciej W. Rozycki
Also IIRC PCI-PCI bridges only forward port I/O space accesses
within the low 64kB.
Not true.

-hpa

Maciej W. Rozycki
2014-06-06 09:02:54 UTC
Permalink
Post by H. Peter Anvin
Post by Maciej W. Rozycki
Also IIRC PCI-PCI bridges only forward port I/O space accesses
within the low 64kB.
Not true.
It must have been an implementation-specific constraint then (e.g. the
DECchip 21050); it's been a while since I looked into it. Thanks for
straightening me out.

Maciej