Lines Matching full:the
21 is used to initiate and complete I/O using the shared submission and
29 is the file descriptor returned by
32 specifies the number of I/Os to submit from the submission queue.
34 is a bitmask of the following values:
37 If this flag is set, then the system call will wait for the specified
45 If the ring has been created with
47 then this flag asks the kernel to wake up the SQ kernel thread to submit IO.
50 If the ring has been created with
52 then the application has no real insight into when the SQ kernel thread has
53 consumed entries from the SQ ring. This can lead to a situation where the
55 when one becomes available as the SQ kernel thread consumes them. If
57 one entry is free in the SQ ring.
60 Since kernel 5.11, the system call's arguments have been modified to look like
69 which behaves just like the original definition by default. However, if
77 must be set to the size of this structure. The definition is as follows:
92 is set to a valid pointer, then this time value indicates the timeout for
98 If the ring file descriptor has been registered through use of
100 then setting this flag will tell the kernel that the
102 passed in is the registered ring offset rather than a normal file descriptor.
106 If the io_uring instance was configured for polling, by specifying
108 in the call to
111 of 0 instructs the kernel to return any events which are already complete,
114 is a non-zero value, the kernel will still return immediately if any
116 available, then the call will poll either until one or more
117 completions become available, or until the process has exceeded its
122 was not specified in the call to
124 an application may check the completion queue for event completions
125 without entering the kernel at all.
127 When the system call returns that a certain number of SQEs have been
128 consumed and submitted, it's safe to reuse SQE entries in the ring. This is
129 true even if the actual IO submission had to be punted to async context,
130 which means that the SQE may in fact not have been submitted yet. If the
141 first replaces the current signal mask by the one pointed to by
143 then waits for events to become available in the completion queue, and
144 then restores the original signal mask. The following
156 executing the following calls:
166 See the description of
168 for an explanation of why the
172 Submission queue entries are represented using the following data
183 __u16 ioprio; /* ioprio for the request */
237 describes the operation to be performed. It can be one of:
240 Do not perform any I/O. This is useful for testing the performance of
250 If the file is not seekable,
266 Note that, while I/O is initiated in the order in which it appears in
268 application which places a write I/O followed by an fsync in the
269 submission queue cannot expect the fsync to apply to the write. The
270 two operations execute in parallel, so the fsync may complete before
271 the write is issued to the storage. The same is also true for
272 previously issued writes that have not completed prior to the fsync.
276 Poll the
278 specified in the submission queue entry for the events
279 specified in the
283 by default this interface always works in one shot mode. That is, once the poll
288 is set in the SQE
290 field, then the poll will work in multi shot mode instead. That means it'll
291 repeatedly trigger when the requested event becomes true, and hence multiple
292 CQEs can be generated from this single SQE. The CQE
296 set on completion if the application should expect further CQE entries from
297 the original request. If this flag isn't set on completion, then the poll
303 is set in the SQE
305 field, then the request will update an existing poll request with the mask of
306 events passed in with this request. The lookup is based on the
308 field of the original SQE submitted, and this value is passed in the
310 field of the SQE. This mode is available since 5.13.
314 is set in the SQE
316 field, then the request will update the
318 of an existing poll request based on the value passed in the
325 and the completion event result is the returned mask of events. For the
330 , the completion result will be similar to
335 Remove an existing poll request. If found, the
337 field of the
345 if the poll request was in the process of completing already.
349 Add, remove or modify entries in the interest list of
353 for details of the system call.
355 holds the file descriptor that represents the epoll instance,
357 holds the file descriptor to add, remove or modify,
359 holds the operation (EPOLL_CTL_ADD, EPOLL_CTL_DEL, EPOLL_CTL_MOD) to perform and,
361 holds a pointer to the
367 Issue the equivalent of a \fBsync_file_range\fR (2) on the file descriptor. The
369 field is the file descriptor to sync, the
371 field holds the offset in bytes, the
373 field holds the length in bytes, and the
375 field holds the flags for the command. See also
377 for the general description of the related system call. Available since 5.2.
381 Issue the equivalent of a
385 must be set to the socket file descriptor,
387 must contain a pointer to the msghdr structure, and
389 holds the flags associated with the system call. See also
391 for the general description of the related system call. Available since 5.3.
397 instead. See the description of IORING_OP_SENDMSG. Available since 5.3.
401 Issue the equivalent of a
405 must be set to the socket file descriptor,
407 must contain a pointer to the buffer,
409 denotes the length of the buffer to send, and
411 holds the flags associated with the system call. See also
413 for the general description of the related system call. Available since 5.6.
419 instead. See the description of IORING_OP_SEND. Available since 5.6.
423 This command will register a timeout operation. The
433 will trigger a wakeup event on the completion ring for anyone waiting for
434 events. A timeout condition is met when either the specified timeout expires,
435 or the specified number of events have completed. Either condition will
436 trigger the event. If set to 0, completed events are not counted, which
437 effectively acts like a timer. io_uring timeouts use the
439 clock source. The request will complete with
441 if the timeout got completed through expiration of the timer, or
443 if the timeout got completed through requests completing on their own. If
444 the timeout was canceled before it expired, the request will complete with
448 Since 5.15, this command also supports the following modifiers in
454 If set, then the clocksource used is
458 This clocksource differs in that it includes time elapsed if the system was
462 If set, then the clocksource used is
476 must contain the
478 field of the previously issued timeout operation. If the specified timeout
482 If the timeout request was found but expiration was already in progress,
485 If the timeout request wasn't found, the request will terminate with a result
500 may also contain IORING_TIMEOUT_ABS, in which case the value given is an
506 Issue the equivalent of an
510 must be set to the socket file descriptor,
512 must contain the pointer to the sockaddr structure, and
514 must contain a pointer to the socklen_t addrlen field. Flags can be passed using
519 for the general description of the related system call. Available since 5.5.
521 If the
523 field is set to a positive number, the file won't be installed into the
524 normal file table as usual but will be placed into the fixed file table at index
526 In this case, instead of returning a file descriptor, the result will contain
527 either 0 on success or an error. If the index points to a valid empty slot, the
528 installation is guaranteed to not fail. If there is already a file in the slot,
543 must contain the
545 field of the request that should be canceled. The cancelation request will
546 complete with one of the following results codes. If found, the
548 field of the cqe will contain 0. If not found,
550 will contain -ENOENT. If found and cancelation was attempted, the
552 field will contain -EALREADY. In this case, the request may or may not
564 acts on the linked request, not the completion queue. The format of the command
568 If used, the timeout specified in the command will cancel the linked command,
569 unless the linked command completes before the timeout. The timeout will
572 if the timer expired and cancelation of the linked request was attempted, or
574 if the timer got canceled because of completion of the linked request. Like
583 Issue the equivalent of a
587 must be set to the socket file descriptor,
589 must contain the const pointer to the sockaddr structure, and
591 must contain the socklen_t addrlen field. See also
593 for the general description of the related system call. Available since 5.5.
597 Issue the equivalent of a
601 must be set to the file descriptor,
603 must contain the mode associated with the operation,
605 must contain the offset on which to operate, and
607 must contain the length. See also
609 for the general description of the related system call. Available since 5.6.
613 Issue the equivalent of a
617 must be set to the file descriptor,
619 must contain the offset on which to operate,
621 must contain the length, and
623 must contain the advice associated with the operation. See also
625 for the general description of the related system call. Available since 5.6.
629 Issue the equivalent of a
633 must contain the address to operate on,
635 must contain the length on which to operate,
638 must contain the advice associated with the operation. See also
640 for the general description of the related system call. Available since 5.6.
644 Issue the equivalent of a
648 is the
652 must contain a pointer to the
658 is the access mode of the file. See also
660 for the general description of the related system call. Available since 5.6.
662 If the
664 field is set to a positive number, the file won't be installed into the
665 normal file table as usual but will be placed into the fixed file table at index
667 In this case, instead of returning a file descriptor, the result will contain
668 either 0 on success or an error. If the index points to a valid empty slot, the
669 installation is guaranteed to not fail. If there is already a file in the slot,
682 Issue the equivalent of a
686 is the
690 must contain a pointer to the
694 should contain the size of the open_how structure, and
696 should be set to the address of the open_how structure. See also
698 for the general description of the related system call. Available since 5.6.
700 If the
702 field is set to a positive number, the file won't be installed into the
703 normal file table as usual but will be placed into the fixed file table at index
705 In this case, instead of returning a file descriptor, the result will contain
706 either 0 on success or an error. If the index points to a valid empty slot, the
707 installation is guaranteed to not fail. If there is already a file in the slot,
720 Issue the equivalent of a
724 is the file descriptor to be closed. See also
726 for the general description of the related system call. Available since 5.6.
727 If the
736 using the io_uring specific direct descriptors. Note that only one of the
737 descriptor fields may be set. The direct close feature is available since
742 Issue the equivalent of a
746 is the
750 must contain a pointer to the
754 is the
758 should be the
762 must contain a pointer to the
766 for the general description of the related system call. Available since 5.6.
772 Issue the equivalent of a
778 is the file descriptor to be operated on,
780 contains the buffer in question,
782 contains the length of the IO operation, and
784 contains the read or write offset. If
792 , the offset will use (and advance) the file position, like the
796 system calls. These are non-vectored versions of the
804 for the general description of the related system call. Available since 5.6.
808 Issue the equivalent of a
812 is the file descriptor to read from,
816 is the file descriptor to write to,
820 is used to pass the equivalent of a NULL for the offsets to
823 contains the number of bytes to copy.
825 contains a bit mask for the flag field associated with the system call.
826 Please note that one of the file descriptors must refer to a pipe.
829 for the general description of the related system call. Available since 5.7.
833 Issue the equivalent of a
837 is the file descriptor to read from,
839 is the file descriptor to write to,
841 contains the number of bytes to copy, and
843 contains a bit mask for the flag field associated with the system call.
844 Please note that both of the file descriptors must refer to a pipe.
847 for the general description of the related system call. Available since 5.8.
853 which then works in an async fashion, like the rest of the io_uring commands.
854 The arguments passed in are the same.
856 must contain a pointer to the array of file descriptors,
858 must contain the length of the array, and
860 must contain the offset at which to operate. Note that the array of file
869 the need to separate the poll + read, which provides a convenient point in
871 as many buffers available as pending reads or receive. With this feature, the
872 application can have its pool of buffers ready in the kernel, and when the
873 file or socket is ready to read/receive data, a buffer can be selected for the
876 must contain the number of buffers to provide,
878 must contain the starting address to add buffers from,
880 must contain the length of each buffer to add from the range,
882 must contain the group ID of this range of buffers, and
884 must contain the starting buffer ID of this range of buffers. With that set,
885 the kernel adds buffers starting with the memory address in
889 Hence the application should provide
893 Buffers are grouped by the group ID, and each buffer within this group will be
894 identical in size according to the above arguments. This allows the application
896 differently sized buffers available depending on what the expectations are of
898 buffer, the
902 must be set to the desired buffer group ID where the buffer should be selected
910 must contain the number of buffers to remove, and
912 must contain the buffer group ID from which to remove the buffers. Available
917 Issue the equivalent of a
921 is the file descriptor to the socket being shut down, and
923 must be set to the
929 Issue the equivalent of a
933 should be set to the
936 should be set to the
939 should be set to the
942 should be set to the
945 should be set to the
949 should be set to the
957 Issue the equivalent of a
961 should be set to the
964 should be set to the
968 should be set to the
976 Issue the equivalent of a
980 should be set to the
983 should be set to the
987 should be set to the
995 Issue the equivalent of a
999 should be set to the
1002 should be set to the
1006 should be set to the
1014 Issue the equivalent of a
1018 should be set to the
1021 should be set to the
1024 should be set to the
1027 should be set to the
1031 should be set to the
1041 must be set to a file descriptor of a ring that the application has access to,
1043 can be set to any 32-bit value that the application wishes to pass on, and
1045 should be set to any 64-bit value that the application wishes to send. On the
1046 target ring, a CQE will be posted with the
1048 field matching the
1052 field matching the
1055 interrupt anyone waiting for completions on the target ring, or it can be used
1056 to pass messages via the two fields. Available since 5.18.
1061 field is a bit mask. The supported flags are:
1066 is an index into the files array registered with the io_uring instance (see the
1068 section of the
1071 a command that doesn't support fixed files, the SQE will error with
1076 When this flag is specified, the SQE will not be started before previously
1081 When this flag is specified, the SQE forms a link with the next SQE in the
1082 submission ring. That next SQE will not be started before the previous request
1084 long. The tail of the chain is denoted by the first SQE that does not have this
1085 flag set. Chains are not supported across submission boundaries. Even if the
1086 last SQE in a submission has this flag set, it will still terminate the current
1088 SQEs that are outside of the chain tail. This means that multiple chains can be
1089 executing in parallel, or chains and individual SQEs. Only members inside the
1092 means that, e.g., a short read will also terminate the remainder of the chain.
1093 If a chain of SQE links is broken, the remaining unstarted part of the chain
1096 as the error code. Available since 5.3.
1099 Like IOSQE_IO_LINK, but it doesn't sever regardless of the completion result.
1100 Note that the link will still sever if we fail submitting the parent request;
1101 hard links are only resilient in the presence of completion results for
1108 overlapped operation of requests that the application knows/assumes will
1109 always (or most of the time) block, the application can ask for an sqe to be
1110 issued async from the start. Available since 5.6.
1113 Used in conjunction with the
1117 flag is set in the command, io_uring will grab a buffer from this pool when
1118 the request is ready to receive or read data. If successful, the resulting CQE
1121 set in the flags part of the struct, and the upper
1123 bits will contain the ID of the selected buffers. This allows the application
1124 to know exactly which buffer was selected for the operation. If no buffers
1125 are available and this flag is set, then the request will fail with
1127 as the error code. Once a buffer has been used, it is no longer available in
1128 the kernel pool. The application must re-register the given buffer again when
1132 Don't generate a CQE if the request completes successfully. If the request
1135 CQEs for all linked requests will be omitted. The notion of failure/success is
1136 opcode specific and is the same as with breaking chains of
1138 One special case is when the request has a linked timeout, then the CQE
1139 generation for the linked timeout is decided solely by whether it has
1142 linked timeout has the flag set, it's guaranteed to not post a CQE.
1145 the last request of a normal link without linked timeouts are marked with the
1147 CQEs in cases where the side effects of a successfully executed operation are
1148 enough for userspace to know the state of the system. One such example would
1154 not used together in a single request. Currently, after the first request with
1157 Note that the error reporting is best effort only, and restrictions may change
1158 in the future.
1164 specifies the I/O priority. See
1169 specifies the file descriptor against which the operation will be
1170 performed, with the exception noted above.
1172 If the operation is one of
1179 must fall within the buffer located at
1181 in the fixed buffer array. If the operation is either
1193 per-I/O flags, as described in the
1201 to provide data sync only semantics. See the descriptions of
1205 in the
1221 is the credentials id to use for this operation. See
1223 for how to register personalities with io_uring. If set to 0, the current
1224 personality of the submitting task is used.
1226 Once the submission queue entry is initialized, I/O is submitted by
1227 placing the index of the submission queue entry into the tail of the
1228 submission queue. After one or more indexes are added to the queue,
1229 and the queue tail is advanced, the
1231 system call can be invoked to initiate the I/O.
1233 Completions use the following data structure:
1249 is copied from the field of the same name in the submission queue
1250 entry. The primary use case is to store data that the application
1251 will need to access upon completion of this particular I/O. The
1261 is the operation-specific result, but io_uring-specific errors
1266 For read and write opcodes, the
1269 values documented in the
1276 holding the equivalent of
1278 for error cases, or the transferred number of bytes in case the operation
1280 field in the CQE. For other request types, the return values are documented
1281 in the matching man page for that type, or in the opcodes section above for
1286 returns the number of I/Os successfully consumed. This can be zero
1289 was zero or if the submission queue was empty. Note that if the ring was
1292 specified, then the return value will generally be the same as
1294 as submission happens outside the context of the system call.
1299 rather than through the system call itself.
1301 Errors that occur not on behalf of a submission queue entry are returned via the
1309 These are the errors returned by
1314 The kernel was unable to allocate memory for the request, or otherwise ran out
1315 of resources to handle it. The application should wait for some completions and
1324 is a valid file descriptor, but the io_uring ring is not in the right state
1327 for details on how to enable the ring.
1330 The application is attempting to overcommit the number of requests it can have
1331 pending. The application should wait for some completions and try again. May
1332 occur if the application tries to queue more requests than we have room for in
1333 the CQ ring, or if the application attempts to wait for more events without
1334 having reaped the ones already present in the CQ ring.
1337 Some bits in the
1342 An invalid user space address was specified for the
1347 The io_uring instance is in the process of being torn down.
1361 These io_uring-specific errors are returned as a negative value in the
1363 field of the completion queue entry.
1378 field in the submission queue entry is invalid, or the
1380 flag was set in the submission queue entry, but no files were registered
1381 with the io_uring instance.
1384 buffer is outside of the process' accessible address space
1390 was specified in the
1392 field of the submission queue entry, but either buffers were not
1393 registered for this io_uring instance, or the address range described
1398 does not fit within the buffer registered at
1411 member of the submission queue entry is invalid.
1420 was specified in the submission queue entry, but the io_uring context
1423 was specified in the call to io_uring_setup).
1429 was specified in the submission queue entry, but the io_uring instance
1436 was specified in the submission queue entry, and the
1447 was specified in the submission queue entry, but the io_uring instance
1455 was set in the submission queue entry.
1461 was specified in the
1463 field of the submission queue entry, but the io_uring instance was
1472 was non-zero in the submission queue entry.
1476 was specified in the
1478 field of the submission queue entry, and the
1488 was set in the
1490 field of the submission queue entry, but the