aboutsummaryrefslogtreecommitdiffstats
path: root/fs/nfs/nfs4proc.c
Commit message (Collapse)AuthorAgeFilesLines
* NFSv4.2: support EXCHGID4_FLAG_SUPP_FENCE_OPS 4.2 EXCHANGE_ID flagOlga Kornievskaia2020-10-161-3/+6
| | | | | | | | | | | | | RFC 7862 introduced a new flag that either client or server is allowed to set: EXCHGID4_FLAG_SUPP_FENCE_OPS. Client needs to update its bitmask to allow for this flag value. v2: changed minor version argument to unsigned int Signed-off-by: Olga Kornievskaia <kolga@netapp.com> CC: <stable@vger.kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: Use the net namespace uniquifier if it is setTrond Myklebust2020-10-091-4/+17
| | | | | | | | If a container sets a net namespace specific uniquifier, then use that in the setclientid/exchangeid process. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: Clean up initialisation of uniquified client id stringsTrond Myklebust2020-10-091-41/+34
| | | | | | | | When the user sets a uniquifier, then ensure we copy the string so that calls to strlen() etc are atomic with calls to snprintf(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFS: Add READ_PLUS data segment supportAnna Schumaker2020-10-071-3/+40
| | | | | | | | This patch adds client support for decoding a single NFS4_CONTENT_DATA segment returned by the server. This is the simplest implementation possible, since it does not account for any hole segments in the reply. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: Wait for stateid updates after CLOSE/OPEN_DOWNGRADEBenjamin Coddington2020-10-021-34/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a CLOSE races with the update of the nfs_state: Process 1 Process 2 Server ========= ========= ======== OPEN file OPEN file Reply OPEN (1) Reply OPEN (2) Update state (1) CLOSE file (1) Reply OLD_STATEID (1) CLOSE file (2) Reply CLOSE (-1) Update state (2) wait for state change OPEN file wake CLOSE file OPEN file wake CLOSE file ... ... We can avoid this situation by not issuing an immediate retry with a bumped seqid when CLOSE/OPEN_DOWNGRADE receives NFS4ERR_OLD_STATEID. Instead, take the same approach used by OPEN and wait at least 5 seconds for outstanding stateid updates to complete if we can detect that we're out of sequence. Note that after this change it is still possible (though unlikely) that CLOSE waits a full 5 seconds, bumps the seqid, and retries -- and that attempt races with another OPEN at the same time. In order to avoid this race (which would result in the livelock), update nfs_need_update_open_stateid() to handle the case where: - the state is NFS_OPEN_STATE, and - the stateid doesn't match the current open stateid Finally, nfs_need_update_open_stateid() is modified to be idempotent and renamed to better suit the purpose of signaling that the stateid passed is the next stateid in sequence. Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: make cache consistency bitmask dynamicOlga Kornievskaia2020-09-241-3/+42
| | | | | | | | | | | | | | Client uses static bitmask for GETATTR on CLOSE/WRITE/DELEGRETURN and ignores the fact that it might have some attributes marked invalid in its cache. Compared to v3 where all attributes are retrieved in postop attributes, v4's cache is frequently out of sync and leads to standalone GETATTRs being sent to the server. Instead, in addition to the minimum cache consistency attributes also check cache_validity and adjust the GETATTR request accordingly. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* nfs4: strengthen error check to avoid unexpected resultChengguang Xu2020-09-211-1/+1
| | | | | | | | | | The variable error is ssize_t, which is signed and will cast to unsigned when comapre with variable size, so add a check to avoid unexpected result in case of negative value of error. Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFS: remove redundant pointer clntColin Ian King2020-09-211-3/+1
| | | | | | | | | | | | The pointer clnt is being initialized with a value that is never read and so this is assignment redundant and can be removed. The pointer can removed because it is being used as a temporary variable and it is clearer to make the direct assignment and remove it completely. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* Merge tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2020-09-091-2/+9
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client bugfixes from Trond Myklebust: - Fix an NFS/RDMA resource leak - Fix the error handling during delegation recall - NFSv4.0 needs to return the delegation on a zero-stateid SETATTR - Stop printk reading past end of string * tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: stop printk reading past end of string NFS: Zero-stateid SETATTR should first return delegation NFSv4.1 handle ERR_DELAY error reclaiming locking state on delegation recall xprtrdma: Release in-flight MRs on disconnect
| * NFS: Zero-stateid SETATTR should first return delegationChuck Lever2020-09-051-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a write delegation isn't available, the Linux NFS client uses a zero-stateid when performing a SETATTR. NFSv4.0 provides no mechanism for an NFS server to match such a request to a particular client. It recalls all delegations for that file, even delegations held by the client issuing the request. If that client happens to hold a read delegation, the server will recall it immediately, resulting in an NFS4ERR_DELAY/CB_RECALL/ DELEGRETURN sequence. Optimize out this pipeline bubble by having the client return any delegations it may hold on a file before it issues a SETATTR(zero-stateid) on that file. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * NFSv4.1 handle ERR_DELAY error reclaiming locking state on delegation recallOlga Kornievskaia2020-08-261-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | A client should be able to handle getting an ERR_DELAY error while doing a LOCK call to reclaim state due to delegation being recalled. This is a transient error that can happen due to server moving its volumes and invalidating its file location cache and upon reference to it during the LOCK call needing to do an expensive lookup (leading to an ERR_DELAY error on a PUTFH). Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* | treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva2020-08-231-16/+16
|/ | | | | | | | | | Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
* Merge tag 'nfs-for-5.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2020-08-151-33/+206
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Trond Myklebust: "Stable fixes: - pNFS: Don't return layout segments that are being used for I/O - pNFS: Don't move layout segments off the active list when being used for I/O Features: - NFS: Add support for user xattrs through the NFSv4.2 protocol - NFS: Allow applications to speed up readdir+statx() using AT_STATX_DONT_SYNC - NFSv4.0 allow nconnect for v4.0 Bugfixes and cleanups: - nfs: ensure correct writeback errors are returned on close() - nfs: nfs_file_write() should check for writeback errors - nfs: Fix getxattr kernel panic and memory overflow - NFS: Fix the pNFS/flexfiles mirrored read failover code - SUNRPC: dont update timeout value on connection reset - freezer: Add unsafe versions of freezable_schedule_timeout_interruptible for NFS - sunrpc: destroy rpc_inode_cachep after unregister_filesystem" * tag 'nfs-for-5.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (32 commits) NFS: Fix flexfiles read failover fs: nfs: delete repeated words in comments rpc_pipefs: convert comma to semicolon nfs: Fix getxattr kernel panic and memory overflow NFS: Don't return layout segments that are in use NFS: Don't move layouts to plh_return_segs list while in use NFS: Add layout segment info to pnfs read/write/commit tracepoints NFS: Add tracepoints for layouterror and layoutstats. NFS: Report the stateid + status in trace_nfs4_layoutreturn_on_close() SUNRPC dont update timeout value on connection reset nfs: nfs_file_write() should check for writeback errors nfs: ensure correct writeback errors are returned on close() NFSv4.2: xattr cache: get rid of cache discard work queue NFS: remove redundant initialization of variable result NFSv4.0 allow nconnect for v4.0 freezer: Add unsafe versions of freezable_schedule_timeout_interruptible for NFS sunrpc: destroy rpc_inode_cachep after unregister_filesystem NFSv4.2: add client side xattr caching. NFSv4.2: hook in the user extended attribute handlers NFSv4.2: add the extended attribute proc functions. ...
| * nfs: Fix getxattr kernel panic and memory overflowJeffrey Mitchell2020-08-121-2/+0
| | | | | | | | | | | | | | | | | | | | Move the buffer size check to decode_attr_security_label() before memcpy() Only call memcpy() if the buffer is large enough Fixes: aa9c2669626c ("NFS: Client implementation of Labeled-NFS") Signed-off-by: Jeffrey Mitchell <jeffrey.mitchell@starlab.io> [Trond: clean up duplicate test of label->len != 0] Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * freezer: Add unsafe versions of freezable_schedule_timeout_interruptible for NFSHe Zhe2020-07-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0688e64bc600 ("NFS: Allow signal interruption of NFS4ERR_DELAYed operations") introduces nfs4_delay_interruptible which also needs an _unsafe version to avoid the following call trace for the same reason explained in commit 416ad3c9c006 ("freezer: add unsafe versions of freezable helpers for NFS") CPU: 4 PID: 3968 Comm: rm Tainted: G W 5.8.0-rc4 #1 Hardware name: Marvell OcteonTX CN96XX board (DT) Call trace: dump_backtrace+0x0/0x1dc show_stack+0x20/0x30 dump_stack+0xdc/0x150 debug_check_no_locks_held+0x98/0xa0 nfs4_delay_interruptible+0xd8/0x120 nfs4_handle_exception+0x130/0x170 nfs4_proc_rmdir+0x8c/0x220 nfs_rmdir+0xa4/0x360 vfs_rmdir.part.0+0x6c/0x1b0 do_rmdir+0x18c/0x210 __arm64_sys_unlinkat+0x64/0x7c el0_svc_common.constprop.0+0x7c/0x110 do_el0_svc+0x24/0xa0 el0_sync_handler+0x13c/0x1b8 el0_sync+0x158/0x180 Signed-off-by: He Zhe <zhe.he@windriver.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * NFSv4.2: add client side xattr caching.Frank van der Linden2020-07-131-6/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement client side caching for NFSv4.2 extended attributes. The cache is a per-inode hashtable, with name/value entries. There is one special entry for the listxattr cache. NFS inodes have a pointer to a cache structure. The cache structure is allocated on demand, freed when the cache is invalidated. Memory shrinkers keep the size in check. Large entries (> PAGE_SIZE) are collected by a separate shrinker, and freed more aggressively than others. Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * NFSv4.2: hook in the user extended attribute handlersFrank van der Linden2020-07-131-2/+121
| | | | | | | | | | | | | | | | | | Now that all the lower level code is there to make the RPC calls, hook it in to the xattr handlers and the listxattr entry point, to make them available. Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * nfs: make the buf_to_pages_noslab function available to the nfs codeFrank van der Linden2020-07-131-2/+2
| | | | | | | | | | | | | | | | | | | | Make the buf_to_pages_noslab function available to the rest of the NFS code. Rename it to nfs4_buf_to_pages_noslab to be consistent. This will be used later in the NFSv4.2 xattr code. Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * nfs: define and use the NFS_INO_INVALID_XATTR flagFrank van der Linden2020-07-131-1/+2
| | | | | | | | | | | | | | | | | | | | Define the NFS_INO_INVALID_XATTR flag, to be used for the NFSv4.2 xattr cache, and use it where appropriate. No functional change as yet. Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * nfs: modify update_changeattr to deal with regular filesFrank van der Linden2020-07-131-26/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now, change attributes in change_info form were only returned by directory operations. However, they are also used for the RFC 8276 extended attribute operations, which work on both directories and regular files. Modify update_changeattr to deal: * Rename it to nfs4_update_changeattr and make it non-static. * Don't always use INO_INVALID_DATA, this isn't needed for a directory that only had its extended attributes changed by us. * Existing callers now always pass in INO_INVALID_DATA. For the current callers of this function, behavior is unchanged. Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * NFSv4.2: query the extended attribute access bitsFrank van der Linden2020-07-131-0/+6
| | | | | | | | | | | | | | | | RFC 8276 defines separate ACCESS bits for extended attribute checking. Query them in nfs_do_access and opendata. Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * NFSv4.2: query the server for extended attribute supportFrank van der Linden2020-07-131-1/+2
| | | | | | | | | | | | | | | | Query the server for extended attribute support, and record it as the NFS_CAP_XATTR flag in the server capabilities. Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| |
| \
*-. | Merge branches 'pm-sleep', 'pm-domains', 'powercap' and 'pm-tools'Rafael J. Wysocki2020-08-031-1/+1
|\ \| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * pm-sleep: PM: sleep: spread "const char *" correctness PM: hibernate: fix white space in a few places freezer: Add unsafe version of freezable_schedule_timeout_interruptible() for NFS PM: sleep: core: Emit changed uevent on wakeup_sysfs_add/remove * pm-domains: PM: domains: Restore comment indentation for generic_pm_domain.child_links PM: domains: Fix up terminology with parent/child * powercap: powercap: Add Power Limit4 support powercap: idle_inject: Replace play_idle() with play_idle_precise() in comments powercap: intel_rapl: add support for Sapphire Rapids * pm-tools: pm-graph v5.7 - important s2idle fixes cpupower: Replace HTTP links with HTTPS ones cpupower: Fix NULL but dereferenced coccicheck errors cpupower: Fix comparing pointer to 0 coccicheck warns
| * | freezer: Add unsafe version of freezable_schedule_timeout_interruptible() ↵He Zhe2020-07-141-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for NFS commit 0688e64bc600 ("NFS: Allow signal interruption of NFS4ERR_DELAYed operations") introduces nfs4_delay_interruptible which also needs an _unsafe version to avoid the following call trace for the same reason explained in commit 416ad3c9c006 ("freezer: add unsafe versions of freezable helpers for NFS") CPU: 4 PID: 3968 Comm: rm Tainted: G W 5.8.0-rc4 #1 Hardware name: Marvell OcteonTX CN96XX board (DT) Call trace: dump_backtrace+0x0/0x1dc show_stack+0x20/0x30 dump_stack+0xdc/0x150 debug_check_no_locks_held+0x98/0xa0 nfs4_delay_interruptible+0xd8/0x120 nfs4_handle_exception+0x130/0x170 nfs4_proc_rmdir+0x8c/0x220 nfs_rmdir+0xa4/0x360 vfs_rmdir.part.0+0x6c/0x1b0 do_rmdir+0x18c/0x210 __arm64_sys_unlinkat+0x64/0x7c el0_svc_common.constprop.0+0x7c/0x110 do_el0_svc+0x24/0xa0 el0_sync_handler+0x13c/0x1b8 el0_sync+0x158/0x180 Signed-off-by: He Zhe <zhe.he@windriver.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* / NFS: Fix interrupted slots by sending a solo SEQUENCE operationAnna Schumaker2020-07-131-2/+18
|/ | | | | | | | | | | | | | | We used to do this before 3453d5708b33, but this was changed to better handle the NFS4ERR_SEQ_MISORDERED error code. This commit fixed the slot re-use case when the server doesn't receive the interrupted operation, but if the server does receive the operation then it could still end up replying to the client with mis-matched operations from the reply cache. We can fix this by sending a SEQUENCE to the server while recovering from a SEQ_MISORDERED error when we detect that we are in an interrupted slot situation. Fixes: 3453d5708b33 (NFSv4.1: Avoid false retries when RPC calls are interrupted) Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4.1 fix rpc_call_done assignment for BIND_CONN_TO_SESSIONOlga Kornievskaia2020-05-271-1/+1
| | | | | | Fixes: 02a95dee8cf0 ("NFS add callback_ops to nfs4_proc_bind_conn_to_session_callback") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFS: Don't use RPC_TASK_CRED_NOREF with delegreturnTrond Myklebust2020-05-111-1/+1
| | | | | | | We are not guaranteed that the credential will remain pinned. Fixes: 612965072020 ("NFSv4: Avoid referencing the cred unnecessarily during NFSv4 I/O") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.1: fix handling of backchannel binding in BIND_CONN_TO_SESSIONOlga Kornievskaia2020-04-281-0/+8
| | | | | | | | | | | | | | | Currently, if the client sends BIND_CONN_TO_SESSION with NFS4_CDFC4_FORE_OR_BOTH but only gets NFS4_CDFS4_FORE back it ignores that it wasn't able to enable a backchannel. To make sure, the client sends BIND_CONN_TO_SESSION as the first operation on the connections (ie., no other session compounds haven't been sent before), and if the client's request to bind the backchannel is not satisfied, then reset the connection and retry. Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4: Remove unreachable error condition due to rpc_run_task()Xiyu Yang2020-04-251-2/+1
| | | | | | | | | | | nfs4_proc_layoutget() invokes rpc_run_task(), which return the value to "task". Since rpc_run_task() is impossible to return an ERR pointer, there is no need to add the IS_ERR() condition on "task" here. So we need to remove it. Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* Merge tag 'nfs-for-5.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2020-04-071-7/+12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - Fix a page leak in nfs_destroy_unlinked_subrequests() - Fix use-after-free issues in nfs_pageio_add_request() - Fix new mount code constant_table array definitions - finish_automount() requires us to hold 2 refs to the mount record Features: - Improve the accuracy of telldir/seekdir by using 64-bit cookies when possible. - Allow one RDMA active connection and several zombie connections to prevent blocking if the remote server is unresponsive. - Limit the size of the NFS access cache by default - Reduce the number of references to credentials that are taken by NFS - pNFS files and flexfiles drivers now support per-layout segment COMMIT lists. - Enable partial-file layout segments in the pNFS/flexfiles driver. - Add support for CB_RECALL_ANY to the pNFS flexfiles layout type - pNFS/flexfiles Report NFS4ERR_DELAY and NFS4ERR_GRACE errors from the DS using the layouterror mechanism. Bugfixes and cleanups: - SUNRPC: Fix krb5p regressions - Don't specify NFS version in "UDP not supported" error - nfsroot: set tcp as the default transport protocol - pnfs: Return valid stateids in nfs_layout_find_inode_by_stateid() - alloc_nfs_open_context() must use the file cred when available - Fix locking when dereferencing the delegation cred - Fix memory leaks in O_DIRECT when nfs_get_lock_context() fails - Various clean ups of the NFS O_DIRECT commit code - Clean up RDMA connect/disconnect - Replace zero-length arrays with C99-style flexible arrays" * tag 'nfs-for-5.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (86 commits) NFS: Clean up process of marking inode stale. SUNRPC: Don't start a timer on an already queued rpc task NFS/pnfs: Reference the layout cred in pnfs_prepare_layoutreturn() NFS/pnfs: Fix dereference of layout cred in pnfs_layoutcommit_inode() NFS: Beware when dereferencing the delegation cred NFS: Add a module parameter to set nfs_mountpoint_expiry_timeout NFS: finish_automount() requires us to hold 2 refs to the mount record NFS: Fix a few constant_table array definitions NFS: Try to join page groups before an O_DIRECT retransmission NFS: Refactor nfs_lock_and_join_requests() NFS: Reverse the submission order of requests in __nfs_pageio_add_request() NFS: Clean up nfs_lock_and_join_requests() NFS: Remove the redundant function nfs_pgio_has_mirroring() NFS: Fix memory leaks in nfs_pageio_stop_mirroring() NFS: Fix a request reference leak in nfs_direct_write_clear_reqs() NFS: Fix use-after-free issues in nfs_pageio_add_request() NFS: Fix races nfs_page_group_destroy() vs nfs_destroy_unlinked_subrequests() NFS: Fix a page leak in nfs_destroy_unlinked_subrequests() NFS: Remove unused FLUSH_SYNC support in nfs_initiate_pgio() pNFS/flexfiles: Specify the layout segment range in LAYOUTGET ...
| * NFS/pnfs: Reference the layout cred in pnfs_prepare_layoutreturn()Trond Myklebust2020-04-031-0/+1
| | | | | | | | | | | | | | When we're sending a layoutreturn, ensure that we reference the layout cred atomically with the copy of the stateid. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * NFS: Beware when dereferencing the delegation credTrond Myklebust2020-04-031-0/+3
| | | | | | | | | | | | | | | | | | | | When we look up the delegation cred, we are usually doing so in conjunction with a read of the stateid, and we want to ensure that the look up is atomic with that read. Fixes: 57f188e04773 ("NFSv4: nfs_update_inplace_delegation() should update delegation cred") [sfr@canb.auug.org.au: Fixed up borken Fixes: line from Trond :-)] Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * nfs: Replace zero-length array with flexible-array memberGustavo A. R. Silva2020-03-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * NFSv4: Avoid unnecessary credential references in layoutgetTrond Myklebust2020-03-161-1/+1
| | | | | | | | | | | | Layoutget is just using the credential attached to the open context. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * NFSv4: Avoid referencing the cred unnecessarily during NFSv4 I/OTrond Myklebust2020-03-161-5/+5
| | | | | | | | | | | | | | Avoid unnecessary references to the cred when we have already referenced it through the open context or the open owner. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
| * NFS: Ensure we time out if a delegreturn does not completeTrond Myklebust2020-03-161-1/+2
| | | | | | | | | | | | | | | | | | We can't allow delegreturn to hold up nfs4_evict_inode() forever, since that can cause the memory shrinkers to block. This patch therefore ensures that we eventually time out, and complete the reclaim of the inode. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* | Merge tag 'selinux-pr-20200330' of ↵Linus Torvalds2020-03-311-9/+3
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull SELinux updates from Paul Moore: "We've got twenty SELinux patches for the v5.7 merge window, the highlights are below: - Deprecate setting /sys/fs/selinux/checkreqprot to 1. This flag was originally created to deal with legacy userspace and the READ_IMPLIES_EXEC personality flag. We changed the default from 1 to 0 back in Linux v4.4 and now we are taking the next step of deprecating it, at some point in the future we will take the final step of rejecting 1. - Allow kernfs symlinks to inherit the SELinux label of the parent directory. In order to preserve backwards compatibility this is protected by the genfs_seclabel_symlinks SELinux policy capability. - Optimize how we store filename transitions in the kernel, resulting in some significant improvements to policy load times. - Do a better job calculating our internal hash table sizes which resulted in additional policy load improvements and likely general SELinux performance improvements as well. - Remove the unused initial SIDs (labels) and improve how we handle initial SIDs. - Enable per-file labeling for the bpf filesystem. - Ensure that we properly label NFS v4.2 filesystems to avoid a temporary unlabeled condition. - Add some missing XFS quota command types to the SELinux quota access controls. - Fix a problem where we were not updating the seq_file position index correctly in selinuxfs. - We consolidate some duplicated code into helper functions. - A number of list to array conversions. - Update Stephen Smalley's email address in MAINTAINERS" * tag 'selinux-pr-20200330' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: clean up indentation issue with assignment statement NFS: Ensure security label is set for root inode MAINTAINERS: Update my email address selinux: avtab_init() and cond_policydb_init() return void selinux: clean up error path in policydb_init() selinux: remove unused initial SIDs and improve handling selinux: reduce the use of hard-coded hash sizes selinux: Add xfs quota command types selinux: optimize storage of filename transitions selinux: factor out loop body from filename_trans_read() security: selinux: allow per-file labeling for bpffs selinux: generalize evaluate_cond_node() selinux: convert cond_expr to array selinux: convert cond_av_list to array selinux: convert cond_list to array selinux: sel_avc_get_stat_idx should increase position index selinux: allow kernfs symlinks to inherit parent directory context selinux: simplify evaluate_cond_node() Documentation,selinux: deprecate setting checkreqprot to 1 selinux: move status variables out of selinux_ss
| * NFS: Ensure security label is set for root inodeScott Mayhew2020-03-301-9/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using NFSv4.2, the security label for the root inode should be set via a call to nfs_setsecurity() during the mount process, otherwise the inode will appear as unlabeled for up to acdirmin seconds. Currently the label for the root inode is allocated, retrieved, and freed entirely witin nfs4_proc_get_root(). Add a field for the label to the nfs_fattr struct, and allocate & free the label in nfs_get_root(), where we also add a call to nfs_setsecurity(). Note that for the call to nfs_setsecurity() to succeed, it's necessary to also move the logic calling security_sb_{set,clone}_security() from nfs_get_tree_common() down into nfs_get_root()... otherwise the SBLABEL_MNT flag will not be set in the super_block's security flags and nfs_setsecurity() will silently fail. Reported-by: Richard Haines <richard_c_haines@btinternet.com> Signed-off-by: Scott Mayhew <smayhew@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Tested-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: fixed 80-char line width problems] Signed-off-by: Paul Moore <paul@paul-moore.com>
* | NFSv4.1 make cachethis=no for writesOlga Kornievskaia2020-02-131-1/+1
| | | | | | | | | | | | | | | | | | Turning caching off for writes on the server should improve performance. Fixes: fba83f34119a ("NFS: Pass "privileged" value to nfs4_init_sequence()") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* | NFSv4: Fix races between open and dentry revalidationTrond Myklebust2020-02-101-2/+16
|/ | | | | | | | | | | | | | We want to make sure that we revalidate the dentry if and only if we've done an OPEN by filename. In order to avoid races with remote changes to the directory on the server, we want to save the verifier before calling OPEN. The exception is if the server returned a delegation with our OPEN, as we then know that the filename can't have changed on the server. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@gmail.com> Tested-by: Benjamin Coddington <bcodding@gmail.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewalsRobert Milkowski2020-02-041-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, each time nfs4_do_fsinfo() is called it will do an implicit NFS4 lease renewal, which is not compliant with the NFS4 specification. This can result in a lease being expired by an NFS server. Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases") introduced implicit client lease renewal in nfs4_do_fsinfo(), which can result in the NFSv4.0 lease to expire on a server side, and servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID. This can easily be reproduced by frequently unmounting a sub-mount, then stat'ing it to get it mounted again, which will delay or even completely prevent client from sending RENEW operations if no other NFS operations are issued. Eventually nfs server will expire client's lease and return an error on file access or next RENEW. This can also happen when a sub-mount is automatically unmounted due to inactivity (after nfs_mountpoint_expiry_timeout), then it is mounted again via stat(). This can result in a short window during which client's lease will expire on a server but not on a client. This specific case was observed on production systems. This patch removes the implicit lease renewal from nfs4_do_fsinfo(). Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases") Signed-off-by: Robert Milkowski <rmilkowski@gmail.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: try lease recovery on NFS4ERR_EXPIREDRobert Milkowski2020-02-041-0/+5
| | | | | | | | | | Currently, if an nfs server returns NFS4ERR_EXPIRED to open(), we return EIO to applications without even trying to recover. Fixes: 272289a3df72 ("NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid") Signed-off-by: Robert Milkowski <rmilkowski@gmail.com> Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFS: Add softreval behaviour to nfs_lookup_revalidate()Trond Myklebust2020-01-241-10/+18
| | | | | | | | | If the server is unavaliable, we want to allow the revalidating lookup to time out, and to default to validating the cached dentry if the 'softreval' mount option is set. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4.x recover from pre-mature loss of openstateidOlga Kornievskaia2020-01-151-0/+2
| | | | | | | | | | | | | | | | Ever since the commit 0e0cb35b417f, it's possible to lose an open stateid while retrying a CLOSE due to ERR_OLD_STATEID. Once that happens, operations that require openstateid fail with EAGAIN which is propagated to the application then tests like generic/446 and generic/168 fail with "Resource temporarily unavailable". Instead of returning this error, initiate state recovery when possible to recover the open stateid and then try calling nfs4_select_rw_stateid() again. Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4 fix acl retrieval over krb5i/krb5p mountsOlga Kornievskaia2020-01-151-5/+13
| | | | | | | | | | | | | | | | | | For the krb5i and krb5p mount, it was problematic to truncate the received ACL to the provided buffer because an integrity check could not be preformed. Instead, provide enough pages to accommodate the largest buffer bounded by the largest RPC receive buffer size. Note: I don't think it's possible for the ACL to be truncated now. Thus NFS4_ACL_TRUNC flag and related code could be possibly removed but since I'm unsure, I'm leaving it. v2: needs +1 page. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFS: Add mount option 'softreval'Trond Myklebust2020-01-151-7/+26
| | | | | | | | | | | | | | | | | Add a mount option 'softreval' that allows attribute revalidation 'getattr' calls to time out, and causes them to fall back to using the cached attributes. The use case for this option is for ensuring that we can still (slowly) traverse paths and use cached information even when the server is down. Once the server comes back up again, the getattr calls start succeeding, and the caches will revalidate as usual. The 'softreval' mount option is automatically enabled if you have specified 'softerr'. It can be turned off using the options 'nosoftreval', or 'hard'. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFS4: Remove unneeded semicolonzhengbin2020-01-151-2/+2
| | | | | | | | | | | | Fixes coccicheck warning: fs/nfs/nfs4state.c:1138:2-3: Unneeded semicolon fs/nfs/nfs4proc.c:6862:2-3: Unneeded semicolon fs/nfs/nfs4proc.c:8629:2-3: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFS: Add fs_context support.David Howells2020-01-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add filesystem context support to NFS, parsing the options in advance and attaching the information to struct nfs_fs_context. The highlights are: (*) Merge nfs_mount_info and nfs_clone_mount into nfs_fs_context. This structure represents NFS's superblock config. (*) Make use of the VFS's parsing support to split comma-separated lists (*) Pin the NFS protocol module in the nfs_fs_context. (*) Attach supplementary error information to fs_context. This has the downside that these strings must be static and can't be formatted. (*) Remove the auxiliary file_system_type structs since the information necessary can be conveyed in the nfs_fs_context struct instead. (*) Root mounts are made by duplicating the config for the requested mount so as to have the same parameters. Submounts pick up their parameters from the parent superblock. [AV -- retrans is u32, not string] [SM -- Renamed cfg to ctx in a few functions in an earlier patch] [SM -- Moved fs_context mount option parsing to an earlier patch] [SM -- Moved fs_context error logging to a later patch] [SM -- Fixed printks in nfs4_try_get_tree() and nfs4_get_referral_tree()] [SM -- Added is_remount_fc() helper] [SM -- Deferred some refactoring to a later patch] [SM -- Fixed referral mounts, which were broken in the original patch] [SM -- Fixed leak of nfs_fattr when fs_context is freed] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: add declaration of current_stateidBen Dooks2019-11-181-3/+3
| | | | | | | | | | | The current_stateid is exported from nfs4state.c but not declared in any of the headers. Add to nfs4_fs.h to remove the following warning: fs/nfs/nfs4state.c:80:20: warning: symbol 'current_stateid' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.x: Drop the slot if nfs4_delegreturn_prepare waits for layoutreturnTrond Myklebust2019-11-131-1/+3
| | | | | | | | If nfs4_delegreturn_prepare needs to wait for a layoutreturn to complete then make sure we drop the sequence slot if we hold it. Fixes: 1c5bd76d17cc ("pNFS: Enable layoutreturn operation for return-on-close") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>