//! This module manages how the incremental compilation cache is represented in
//! the file system.
//!
//! Incremental compilation caches are managed according to a copy-on-write
//! strategy: Once a complete, consistent cache version is finalized, it is
//! never modified. Instead, when a subsequent compilation session is started,
//! the compiler will allocate a new version of the cache that starts out as
//! a copy of the previous version. Then only this new copy is modified and it
//! will not be visible to other processes until it is finalized. This ensures
//! that multiple compiler processes can be executed concurrently for the same
//! crate without interfering with each other or blocking each other.
//!
//! More concretely, this is implemented via the following protocol:
//!
//! 1. For a newly started compilation session, the compiler allocates a
//!    new `session` directory within the incremental compilation directory.
//!    This session directory will have a unique name that ends with the suffix
//!    "-working" and that contains a creation timestamp.
//! 2. Next, the compiler looks for the newest finalized session directory,
//!    that is, a session directory from a previous compilation session that
//!    has been marked as valid and consistent. A session directory is
//!    considered finalized if the "-working" suffix in the directory name has
//!    been replaced by the SVH (strict version hash) of the crate.
//! 3. Once the compiler has found a valid, finalized session directory, it will
//!    hard-link/copy its contents into the new "-working" directory. If all
//!    goes well, it will have its own, private copy of the source directory and
//!    subsequently not have to worry about synchronizing with other compiler
//!    processes.
//! 4. Now the compiler can do its normal compilation process, which involves
//!    reading and updating its private session directory.
//! 5. When compilation finishes without errors, the private session directory
//!    will be in a state where it can be used as input for other compilation
//!    sessions. That is, it will contain a dependency graph and cache artifacts
//!    that are consistent with the state of the source code it was compiled
//!    from, with no need to change them ever again. At this point, the compiler
//!    finalizes and "publishes" its private session directory by renaming it
//!    from "s-{timestamp}-{random}-working" to "s-{timestamp}-{SVH}".
//! 6. At this point the "old" session directory that we copied our data from
//!    at the beginning of the session has become obsolete because we have just
//!    published a more current version. Thus the compiler will delete it.
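//!
//! As a minimal sketch of the renaming in step 5 (not the code used by this
//! module), the finalized name can be derived from the working name by
//! swapping the "-working" suffix for the encoded SVH. The helper
//! `finalized_name` is hypothetical and exists only for illustration:
//!
//! ```rust
//! /// Hypothetical helper: replaces the "-working" suffix with `svh`.
//! /// Returns `None` if `working_name` is not a private session directory.
//! fn finalized_name(working_name: &str, svh: &str) -> Option<String> {
//!     let prefix = working_name.strip_suffix("working")?;
//!     // A private session directory looks like "s-{timestamp}-{random}-".
//!     if !prefix.starts_with("s-") || !prefix.ends_with('-') {
//!         return None;
//!     }
//!     Some(format!("{prefix}{svh}"))
//! }
//! ```
//!
//! Under these assumptions, a name that does not end in "-working" yields
//! `None`, mirroring the fact that only private directories get published.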
//!
//! ## Garbage Collection
//!
//! Naively following the above protocol might lead to old session directories
//! piling up if a compiler instance crashes for some reason before it's able to
//! remove its private session directory. In order to avoid wasting disk space,
//! the compiler also does some garbage collection each time it is started in
//! incremental compilation mode. Specifically, it will scan the incremental
//! compilation directory for private session directories that are not in use
//! any more and will delete those. It will also delete any finalized session
//! directories for a given crate except for the most recent one.
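//!
//! The retention rule for finalized directories can be sketched as follows.
//! This is an illustration only: `stale_finalized_dirs` is a hypothetical
//! helper, and timestamps are assumed to be already decoded into integers.
//!
//! ```rust
//! /// Given `(timestamp, directory name)` pairs for the finalized session
//! /// directories of one crate, returns the names of all but the most
//! /// recent one, i.e. the candidates for deletion.
//! fn stale_finalized_dirs(mut dirs: Vec<(u64, String)>) -> Vec<String> {
//!     // Sort newest first, then skip the directory we want to keep.
//!     dirs.sort_by(|a, b| b.0.cmp(&a.0));
//!     dirs.into_iter().skip(1).map(|(_, name)| name).collect()
//! }
//! ```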
//!
//! ## Synchronization
//!
//! There is some synchronization needed in order for the compiler to be able to
//! determine whether a given private session directory is not in use any more.
//! This is done by creating a lock file for each session directory and
//! locking it while the directory is still being used. Since file locks have
//! operating system support, we can rely on the lock being released if the
//! compiler process dies for some unexpected reason. Thus, when garbage
//! collecting private session directories, the collecting process can determine
//! whether the directory is still in use by trying to acquire a lock on the
//! file. If locking the file fails, the original process must still be alive.
//! If locking the file succeeds, we know that the owning process is not alive
//! any more and we can safely delete the directory.
//! There is still a small time window between the original process creating the
//! lock file and actually locking it. In order to minimize the chance that
//! another process tries to acquire the lock at just that instant, only
//! session directories that are older than a few seconds are considered for
//! garbage collection.
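//!
//! The age check can be sketched as below. This is a simplified illustration:
//! `is_collectable` and its explicit `now` and `grace` parameters are
//! hypothetical and were chosen to keep the example deterministic.
//!
//! ```rust
//! use std::time::{Duration, SystemTime};
//!
//! /// Returns `true` if a directory created at `created` is older than
//! /// `grace` when observed at `now`.
//! fn is_collectable(created: SystemTime, now: SystemTime, grace: Duration) -> bool {
//!     match now.duration_since(created) {
//!         Ok(age) => age > grace,
//!         // `created` lies in the future (e.g. clock skew): leave it alone.
//!         Err(_) => false,
//!     }
//! }
//! ```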
//!
//! Another case that has to be considered is what happens if one process
//! deletes a finalized session directory that another process is currently
//! trying to copy from. This case is also handled via the lock file. Before
//! a process starts copying a finalized session directory, it will acquire a
//! shared lock on the directory's lock file. Any garbage collecting process,
//! on the other hand, will acquire an exclusive lock on the lock file.
//! Thus, if a directory is being collected, any reader process will fail to
//! acquire the shared lock and will leave the directory alone. Conversely,
//! if a collecting process can't acquire the exclusive lock because the
//! directory is currently being read from, it will leave collecting that
//! directory to another process at a later point in time.
//! The exact same scheme is also used when reading the metadata hashes file
//! from an extern crate. When a crate is compiled, the hash values of its
//! metadata are stored in a file in its session directory. When the
//! compilation session of another crate imports the first crate's metadata,
//! it also has to read in the accompanying metadata hashes. It thus will access
//! the finalized session directory of all crates it links to and while doing
//! so, it will also place a read lock on the respective session directory
//! so that it won't be deleted while the metadata hashes are loaded.
//!
//! ## Preconditions
//!
//! This system relies on two features being available in the file system in
//! order to work really well: file locking and hard linking.
//! If hard linking is not available (like on FAT) the data in the cache
//! actually has to be copied at the beginning of each session.
//! If file locking does not work reliably (like on NFS), some of the
//! synchronization will go haywire.
//! In both cases we recommend locating the incremental compilation directory
//! on a file system that supports these things.
//! It might be a good idea though to try and detect whether we are on an
//! unsupported file system and emit a warning in that case. This is not yet
//! implemented.

use crate::errors;
use rustc_data_structures::fx::{FxHashSet, FxIndexSet};
use rustc_data_structures::svh::Svh;
use rustc_data_structures::unord::{UnordMap, UnordSet};
use rustc_data_structures::{base_n, flock};
use rustc_errors::ErrorGuaranteed;
use rustc_fs_util::{link_or_copy, try_canonicalize, LinkOrCopy};
use rustc_session::{Session, StableCrateId};
use rustc_span::Symbol;

use std::fs as std_fs;
use std::io::{self, ErrorKind};
use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime, UNIX_EPOCH};

use rand::{thread_rng, RngCore};

#[cfg(test)]
mod tests;

const LOCK_FILE_EXT: &str = ".lock";
const DEP_GRAPH_FILENAME: &str = "dep-graph.bin";
const STAGING_DEP_GRAPH_FILENAME: &str = "dep-graph.part.bin";
const WORK_PRODUCTS_FILENAME: &str = "work-products.bin";
const QUERY_CACHE_FILENAME: &str = "query-cache.bin";

// We encode integers using the following base, so they are shorter than decimal
// or hexadecimal numbers (we want short file and directory names). Since these
// numbers will be used in file names, we choose an encoding that is not
// case-sensitive (as opposed to base64, for example).
const INT_ENCODE_BASE: usize = base_n::CASE_INSENSITIVE;

/// Returns the path to a session's dependency graph.
pub fn dep_graph_path(sess: &Session) -> PathBuf {
    in_incr_comp_dir_sess(sess, DEP_GRAPH_FILENAME)
}
/// Returns the path to a session's staging dependency graph.
///
/// On the difference between dep-graph and staging dep-graph,
/// see `build_dep_graph`.
pub fn staging_dep_graph_path(sess: &Session) -> PathBuf {
    in_incr_comp_dir_sess(sess, STAGING_DEP_GRAPH_FILENAME)
}
/// Returns the path to a session's work products.
pub fn work_products_path(sess: &Session) -> PathBuf {
    in_incr_comp_dir_sess(sess, WORK_PRODUCTS_FILENAME)
}
/// Returns the path to a session's query cache.
pub fn query_cache_path(sess: &Session) -> PathBuf {
    in_incr_comp_dir_sess(sess, QUERY_CACHE_FILENAME)
}

/// Returns the path to the lock file of a given session directory.
pub fn lock_file_path(session_dir: &Path) -> PathBuf {
    let crate_dir = session_dir.parent().unwrap();

    let directory_name = session_dir.file_name().unwrap().to_string_lossy();
    assert_no_characters_lost(&directory_name);

    let dash_indices: Vec<_> = directory_name.match_indices('-').map(|(idx, _)| idx).collect();
    if dash_indices.len() != 3 {
        bug!(
            "Encountered incremental compilation session directory with \
              malformed name: {}",
            session_dir.display()
        )
    }

    crate_dir.join(&directory_name[0..dash_indices[2]]).with_extension(&LOCK_FILE_EXT[1..])
}

/// Returns the path for a given filename within the incremental compilation directory
/// in the current session.
pub fn in_incr_comp_dir_sess(sess: &Session, file_name: &str) -> PathBuf {
    in_incr_comp_dir(&sess.incr_comp_session_dir(), file_name)
}

/// Returns the path for a given filename within the incremental compilation directory,
/// not necessarily from the current session.
///
/// To ensure the file is part of the current session, use [`in_incr_comp_dir_sess`].
pub fn in_incr_comp_dir(incr_comp_session_dir: &Path, file_name: &str) -> PathBuf {
    incr_comp_session_dir.join(file_name)
}

/// Allocates the private session directory.
///
/// If the result of this function is `Ok`, we have a valid incremental
/// compilation session directory. A valid session
/// directory is one that contains a locked lock file. It may or may not contain
/// a dep-graph and work products from a previous session.
///
/// This always attempts to load a dep-graph from the directory.
/// If loading fails for some reason, we fall back to a disabled `DepGraph`.
/// See [`rustc_interface::queries::dep_graph`].
///
/// If this function returns an error, it may leave behind an invalid session directory.
/// The garbage collection will take care of it.
///
/// [`rustc_interface::queries::dep_graph`]: ../../rustc_interface/struct.Queries.html#structfield.dep_graph
pub fn prepare_session_directory(
    sess: &Session,
    crate_name: Symbol,
    stable_crate_id: StableCrateId,
) -> Result<(), ErrorGuaranteed> {
    if sess.opts.incremental.is_none() {
        return Ok(());
    }

    let _timer = sess.timer("incr_comp_prepare_session_directory");

    debug!("prepare_session_directory");

    // {incr-comp-dir}/{crate-name-and-disambiguator}
    let crate_dir = crate_path(sess, crate_name, stable_crate_id);
    debug!("crate-dir: {}", crate_dir.display());
    create_dir(sess, &crate_dir, "crate")?;

    // Hack: canonicalize the path *after creating the directory*
    // because, on Windows, long paths can cause problems;
    // canonicalization inserts this weird prefix that makes Windows
    // tolerate long paths.
    let crate_dir = match try_canonicalize(&crate_dir) {
        Ok(v) => v,
        Err(err) => {
            return Err(sess.emit_err(errors::CanonicalizePath { path: crate_dir, err }));
        }
    };

    let mut source_directories_already_tried = FxHashSet::default();

    loop {
        // Generate a session directory of the form:
        //
        // {incr-comp-dir}/{crate-name-and-disambiguator}/s-{timestamp}-{random}-working
        let session_dir = generate_session_dir_path(&crate_dir);
        debug!("session-dir: {}", session_dir.display());

        // Lock the new session directory. If this fails, return an
        // error without retrying
        let (directory_lock, lock_file_path) = lock_directory(sess, &session_dir)?;

        // Now that we have the lock, we can actually create the session
        // directory
        create_dir(sess, &session_dir, "session")?;

        // Find a suitable source directory to copy from. Ignore those that we
        // have already tried before.
        let source_directory = find_source_directory(&crate_dir, &source_directories_already_tried);

        let Some(source_directory) = source_directory else {
            // There's nowhere to copy from, we're done
            debug!(
                "no source directory found. Continuing with empty session \
                    directory."
            );

            sess.init_incr_comp_session(session_dir, directory_lock, false);
            return Ok(());
        };

        debug!("attempting to copy data from source: {}", source_directory.display());

        // Try copying over all files from the source directory
        if let Ok(allows_links) = copy_files(sess, &session_dir, &source_directory) {
            debug!("successfully copied data from: {}", source_directory.display());

            if !allows_links {
                sess.emit_warning(errors::HardLinkFailed { path: &session_dir });
            }

            sess.init_incr_comp_session(session_dir, directory_lock, true);
            return Ok(());
        } else {
            debug!("copying failed - trying next directory");

            // Something went wrong while trying to copy/link files from the
            // source directory. Try again with a different one.
            source_directories_already_tried.insert(source_directory);

            // Try to remove the session directory we just allocated. We don't
            // know if there's any garbage in it from the failed copy action.
            if let Err(err) = safe_remove_dir_all(&session_dir) {
                sess.emit_warning(errors::DeletePartial { path: &session_dir, err });
            }

            delete_session_dir_lock_file(sess, &lock_file_path);
            drop(directory_lock);
        }
    }
}

/// This function finalizes and thus 'publishes' the session directory by
/// renaming it to `s-{timestamp}-{svh}` and releasing the file lock.
/// If there have been compilation errors, however, this function will just
/// delete the presumably invalid session directory.
pub fn finalize_session_directory(sess: &Session, svh: Option<Svh>) {
    if sess.opts.incremental.is_none() {
        return;
    }
    // The svh is always produced when incr. comp. is enabled.
    let svh = svh.unwrap();

    let _timer = sess.timer("incr_comp_finalize_session_directory");

    let incr_comp_session_dir: PathBuf = sess.incr_comp_session_dir().clone();

    if sess.has_errors_or_delayed_span_bugs().is_some() {
        // If there have been any errors during compilation, we don't want to
        // publish this session directory. Rather, we'll just delete it.

        debug!(
            "finalize_session_directory() - invalidating session directory: {}",
            incr_comp_session_dir.display()
        );

        if let Err(err) = safe_remove_dir_all(&*incr_comp_session_dir) {
            sess.emit_warning(errors::DeleteFull { path: &incr_comp_session_dir, err });
        }

        let lock_file_path = lock_file_path(&*incr_comp_session_dir);
        delete_session_dir_lock_file(sess, &lock_file_path);
        sess.mark_incr_comp_session_as_invalid();
    }

    debug!("finalize_session_directory() - session directory: {}", incr_comp_session_dir.display());

    let old_sub_dir_name = incr_comp_session_dir.file_name().unwrap().to_string_lossy();
    assert_no_characters_lost(&old_sub_dir_name);

    // Keep the 's-{timestamp}-{random-number}' prefix, but replace the
    // '-working' part with the SVH of the crate
    let dash_indices: Vec<_> = old_sub_dir_name.match_indices('-').map(|(idx, _)| idx).collect();
    if dash_indices.len() != 3 {
        bug!(
            "Encountered incremental compilation session directory with \
              malformed name: {}",
            incr_comp_session_dir.display()
        )
    }

    // State: "s-{timestamp}-{random-number}-"
    let mut new_sub_dir_name = String::from(&old_sub_dir_name[..=dash_indices[2]]);

    // Append the svh
    base_n::push_str(svh.as_u128(), INT_ENCODE_BASE, &mut new_sub_dir_name);

    // Create the full path
    let new_path = incr_comp_session_dir.parent().unwrap().join(new_sub_dir_name);
    debug!("finalize_session_directory() - new path: {}", new_path.display());

    match rename_path_with_retry(&*incr_comp_session_dir, &new_path, 3) {
        Ok(_) => {
            debug!("finalize_session_directory() - directory renamed successfully");

            // This unlocks the directory
            sess.finalize_incr_comp_session(new_path);
        }
        Err(e) => {
            // Warn about the error. However, no need to abort compilation now.
            sess.emit_warning(errors::Finalize { path: &incr_comp_session_dir, err: e });

            debug!("finalize_session_directory() - error, marking as invalid");
            // Drop the file lock, so we can garbage collect
            sess.mark_incr_comp_session_as_invalid();
        }
    }

    let _ = garbage_collect_session_directories(sess);
}

pub fn delete_all_session_dir_contents(sess: &Session) -> io::Result<()> {
    let sess_dir_iterator = sess.incr_comp_session_dir().read_dir()?;
    for entry in sess_dir_iterator {
        let entry = entry?;
        safe_remove_file(&entry.path())?
    }
    Ok(())
}

fn copy_files(sess: &Session, target_dir: &Path, source_dir: &Path) -> Result<bool, ()> {
    // We acquire a shared lock on the lock file of the directory, so that
    // nobody deletes it out from under us while we are reading from it.
    let lock_file_path = lock_file_path(source_dir);

    let Ok(_lock) = flock::Lock::new(
        &lock_file_path,
        false, // don't wait
        false, // don't create
        false, // not exclusive
    ) else {
        // Could not acquire the lock, don't try to copy from here
        return Err(());
    };

    let Ok(source_dir_iterator) = source_dir.read_dir() else {
        return Err(());
    };

    let mut files_linked = 0;
    let mut files_copied = 0;

    for entry in source_dir_iterator {
        match entry {
            Ok(entry) => {
                let file_name = entry.file_name();

                let target_file_path = target_dir.join(file_name);
                let source_path = entry.path();

                debug!("copying into session dir: {}", source_path.display());
                match link_or_copy(source_path, target_file_path) {
                    Ok(LinkOrCopy::Link) => files_linked += 1,
                    Ok(LinkOrCopy::Copy) => files_copied += 1,
                    Err(_) => return Err(()),
                }
            }
            Err(_) => return Err(()),
        }
    }

    if sess.opts.unstable_opts.incremental_info {
        eprintln!(
            "[incremental] session directory: \
                  {} files hard-linked",
            files_linked
        );
        eprintln!(
            "[incremental] session directory: \
                 {} files copied",
            files_copied
        );
    }

    Ok(files_linked > 0 || files_copied == 0)
}

/// Generates a unique directory path of the form:
/// {crate_dir}/s-{timestamp}-{random-number}-working
fn generate_session_dir_path(crate_dir: &Path) -> PathBuf {
    let timestamp = timestamp_to_string(SystemTime::now());
    debug!("generate_session_dir_path: timestamp = {}", timestamp);
    let random_number = thread_rng().next_u32();
    debug!("generate_session_dir_path: random_number = {}", random_number);

    let directory_name = format!(
        "s-{}-{}-working",
        timestamp,
        base_n::encode(random_number as u128, INT_ENCODE_BASE)
    );
    debug!("generate_session_dir_path: directory_name = {}", directory_name);
    let directory_path = crate_dir.join(directory_name);
    debug!("generate_session_dir_path: directory_path = {}", directory_path.display());
    directory_path
}

fn create_dir(sess: &Session, path: &Path, dir_tag: &str) -> Result<(), ErrorGuaranteed> {
    match std_fs::create_dir_all(path) {
        Ok(()) => {
            debug!("{} directory created successfully", dir_tag);
            Ok(())
        }
        Err(err) => Err(sess.emit_err(errors::CreateIncrCompDir { tag: dir_tag, path, err })),
    }
}

/// Allocates the lock file and locks it.
fn lock_directory(
    sess: &Session,
    session_dir: &Path,
) -> Result<(flock::Lock, PathBuf), ErrorGuaranteed> {
    let lock_file_path = lock_file_path(session_dir);
    debug!("lock_directory() - lock_file: {}", lock_file_path.display());

    match flock::Lock::new(
        &lock_file_path,
        false, // don't wait
        true,  // create the lock file
        true,  // the lock should be exclusive
    ) {
        Ok(lock) => Ok((lock, lock_file_path)),
        Err(lock_err) => {
            let is_unsupported_lock = flock::Lock::error_unsupported(&lock_err).then_some(());
            Err(sess.emit_err(errors::CreateLock {
                lock_err,
                session_dir,
                is_unsupported_lock,
                is_cargo: std::env::var_os("CARGO").map(|_| ()),
            }))
        }
    }
}

fn delete_session_dir_lock_file(sess: &Session, lock_file_path: &Path) {
    if let Err(err) = safe_remove_file(&lock_file_path) {
        sess.emit_warning(errors::DeleteLock { path: lock_file_path, err });
    }
}

/// Finds the most recent published session directory that is not in the
/// ignore-list.
fn find_source_directory(
    crate_dir: &Path,
    source_directories_already_tried: &FxHashSet<PathBuf>,
) -> Option<PathBuf> {
    let iter = crate_dir
        .read_dir()
        .unwrap() // FIXME
        .filter_map(|e| e.ok().map(|e| e.path()));

    find_source_directory_in_iter(iter, source_directories_already_tried)
}

fn find_source_directory_in_iter<I>(
    iter: I,
    source_directories_already_tried: &FxHashSet<PathBuf>,
) -> Option<PathBuf>
where
    I: Iterator<Item = PathBuf>,
{
    let mut best_candidate = (UNIX_EPOCH, None);

    for session_dir in iter {
        debug!("find_source_directory_in_iter - inspecting `{}`", session_dir.display());

        let directory_name = session_dir.file_name().unwrap().to_string_lossy();
        assert_no_characters_lost(&directory_name);

        if source_directories_already_tried.contains(&session_dir)
            || !is_session_directory(&directory_name)
            || !is_finalized(&directory_name)
        {
            debug!("find_source_directory_in_iter - ignoring");
            continue;
        }

        let timestamp = extract_timestamp_from_session_dir(&directory_name).unwrap_or_else(|_| {
            bug!("unexpected incr-comp session dir: {}", session_dir.display())
        });

        if timestamp > best_candidate.0 {
            best_candidate = (timestamp, Some(session_dir.clone()));
        }
    }

    best_candidate.1
}

fn is_finalized(directory_name: &str) -> bool {
    !directory_name.ends_with("-working")
}

fn is_session_directory(directory_name: &str) -> bool {
    directory_name.starts_with("s-") && !directory_name.ends_with(LOCK_FILE_EXT)
}

fn is_session_directory_lock_file(file_name: &str) -> bool {
    file_name.starts_with("s-") && file_name.ends_with(LOCK_FILE_EXT)
}

fn extract_timestamp_from_session_dir(directory_name: &str) -> Result<SystemTime, ()> {
    if !is_session_directory(directory_name) {
        return Err(());
    }

    let dash_indices: Vec<_> = directory_name.match_indices('-').map(|(idx, _)| idx).collect();
    if dash_indices.len() != 3 {
        return Err(());
    }

    string_to_timestamp(&directory_name[dash_indices[0] + 1..dash_indices[1]])
}

fn timestamp_to_string(timestamp: SystemTime) -> String {
    let duration = timestamp.duration_since(UNIX_EPOCH).unwrap();
    let micros = duration.as_secs() * 1_000_000 + (duration.subsec_nanos() as u64) / 1000;
    base_n::encode(micros as u128, INT_ENCODE_BASE)
}

fn string_to_timestamp(s: &str) -> Result<SystemTime, ()> {
    let micros_since_unix_epoch = u64::from_str_radix(s, INT_ENCODE_BASE as u32).map_err(|_| ())?;

    let duration = Duration::new(
        micros_since_unix_epoch / 1_000_000,
        1000 * (micros_since_unix_epoch % 1_000_000) as u32,
    );
    Ok(UNIX_EPOCH + duration)
}

fn crate_path(sess: &Session, crate_name: Symbol, stable_crate_id: StableCrateId) -> PathBuf {
    let incr_dir = sess.opts.incremental.as_ref().unwrap().clone();

    let stable_crate_id = base_n::encode(stable_crate_id.as_u64() as u128, INT_ENCODE_BASE);

    let crate_name = format!("{}-{}", crate_name, stable_crate_id);
    incr_dir.join(crate_name)
}

fn assert_no_characters_lost(s: &str) {
    if s.contains('\u{FFFD}') {
        bug!("Could not losslessly convert '{}'.", s)
    }
}

fn is_old_enough_to_be_collected(timestamp: SystemTime) -> bool {
    timestamp < SystemTime::now() - Duration::from_secs(10)
}

621 /// Runs garbage collection for the current session.
garbage_collect_session_directories(sess: &Session) -> io::Result<()>622 pub fn garbage_collect_session_directories(sess: &Session) -> io::Result<()> {
623     debug!("garbage_collect_session_directories() - begin");
624 
625     let session_directory = sess.incr_comp_session_dir();
626     debug!(
627         "garbage_collect_session_directories() - session directory: {}",
628         session_directory.display()
629     );
630 
631     let crate_directory = session_directory.parent().unwrap();
632     debug!(
633         "garbage_collect_session_directories() - crate directory: {}",
634         crate_directory.display()
635     );
636 
637     // First do a pass over the crate directory, collecting lock files and
638     // session directories
639     let mut session_directories = FxIndexSet::default();
640     let mut lock_files = UnordSet::default();
641 
642     for dir_entry in crate_directory.read_dir()? {
643         let Ok(dir_entry) = dir_entry else {
644             // Ignore any errors
645             continue;
646         };
647 
648         let entry_name = dir_entry.file_name();
649         let entry_name = entry_name.to_string_lossy();
650 
651         if is_session_directory_lock_file(&entry_name) {
652             assert_no_characters_lost(&entry_name);
653             lock_files.insert(entry_name.into_owned());
654         } else if is_session_directory(&entry_name) {
655             assert_no_characters_lost(&entry_name);
656             session_directories.insert(entry_name.into_owned());
657         } else {
658             // This is something we don't know, leave it alone
659         }
660     }
661     session_directories.sort();
662 
663     // Now map from lock files to session directories
664     let lock_file_to_session_dir: UnordMap<String, Option<String>> = lock_files
665         .into_items()
666         .map(|lock_file_name| {
667             assert!(lock_file_name.ends_with(LOCK_FILE_EXT));
668             let dir_prefix_end = lock_file_name.len() - LOCK_FILE_EXT.len();
669             let session_dir = {
670                 let dir_prefix = &lock_file_name[0..dir_prefix_end];
671                 session_directories.iter().find(|dir_name| dir_name.starts_with(dir_prefix))
672             };
673             (lock_file_name, session_dir.map(String::clone))
674         })
675         .into();

    // Delete all lock files that don't have an associated directory. They must
    // be some kind of leftover.
    for (lock_file_name, directory_name) in
        lock_file_to_session_dir.items().into_sorted_stable_ord()
    {
        if directory_name.is_none() {
            let Ok(timestamp) = extract_timestamp_from_session_dir(lock_file_name) else {
                debug!(
                    "found lock-file with malformed timestamp: {}",
                    crate_directory.join(&lock_file_name).display()
                );
                // Ignore it
                continue;
            };

            let lock_file_path = crate_directory.join(&*lock_file_name);

            if is_old_enough_to_be_collected(timestamp) {
                debug!(
                    "garbage_collect_session_directories() - deleting \
                    garbage lock file: {}",
                    lock_file_path.display()
                );
                delete_session_dir_lock_file(sess, &lock_file_path);
            } else {
                debug!(
                    "garbage_collect_session_directories() - lock file with \
                    no session dir not old enough to be collected: {}",
                    lock_file_path.display()
                );
            }
        }
    }

    // Filter out `None` directories
    let lock_file_to_session_dir: UnordMap<String, String> = lock_file_to_session_dir
        .into_items()
        .filter_map(|(lock_file_name, directory_name)| directory_name.map(|n| (lock_file_name, n)))
        .into();

    // Delete all session directories that don't have a lock file.
    for directory_name in session_directories {
        if !lock_file_to_session_dir.items().any(|(_, dir)| *dir == directory_name) {
            let path = crate_directory.join(directory_name);
            if let Err(err) = safe_remove_dir_all(&path) {
                sess.emit_warning(errors::InvalidGcFailed { path: &path, err });
            }
        }
    }

    // Now garbage collect the valid session directories.
    let deletion_candidates =
        lock_file_to_session_dir.items().filter_map(|(lock_file_name, directory_name)| {
            debug!("garbage_collect_session_directories() - inspecting: {}", directory_name);

            let Ok(timestamp) = extract_timestamp_from_session_dir(directory_name) else {
                debug!(
                    "found session-dir with malformed timestamp: {}",
                    crate_directory.join(directory_name).display()
                );
                // Ignore it
                return None;
            };

            if is_finalized(directory_name) {
                let lock_file_path = crate_directory.join(lock_file_name);
                match flock::Lock::new(
                    &lock_file_path,
                    false, // don't wait
                    false, // don't create the lock-file
                    true,  // get an exclusive lock
                ) {
                    Ok(lock) => {
                        debug!(
                            "garbage_collect_session_directories() - \
                            successfully acquired lock"
                        );
                        debug!(
                            "garbage_collect_session_directories() - adding \
                            deletion candidate: {}",
                            directory_name
                        );

                        // Note that we are holding on to the lock
                        return Some((
                            (timestamp, crate_directory.join(directory_name)),
                            Some(lock),
                        ));
                    }
                    Err(_) => {
                        debug!(
                            "garbage_collect_session_directories() - \
                            not collecting, still in use"
                        );
                    }
                }
            } else if is_old_enough_to_be_collected(timestamp) {
                // When cleaning out "-working" session directories, i.e.
                // session directories that might still be in use by another
                // compiler instance, we only look at directories that are
                // at least ten seconds old. This is supposed to reduce the
                // chance of deleting a directory in the time window where
                // the process has allocated the directory but has not yet
                // acquired the file-lock on it.

                // Try to acquire the directory lock. If we can't, it
                // means that the owning process is still alive and we
                // leave this directory alone.
                let lock_file_path = crate_directory.join(lock_file_name);
                match flock::Lock::new(
                    &lock_file_path,
                    false, // don't wait
                    false, // don't create the lock-file
                    true,  // get an exclusive lock
                ) {
                    Ok(lock) => {
                        debug!(
                            "garbage_collect_session_directories() - \
                            successfully acquired lock"
                        );

                        delete_old(sess, &crate_directory.join(directory_name));

                        // Let's make it explicit that the file lock is released at this point,
                        // or rather, that we held on to it until here
                        drop(lock);
                    }
                    Err(_) => {
                        debug!(
                            "garbage_collect_session_directories() - \
                            not collecting, still in use"
                        );
                    }
                }
            } else {
                debug!(
                    "garbage_collect_session_directories() - not finalized, not \
                    old enough"
                );
            }
            None
        });
    let deletion_candidates = deletion_candidates.into();

    // Delete all but the most recent of the candidates.
    // `all` is used here only to visit every item; the closure always returns `true`.
    all_except_most_recent(deletion_candidates).into_items().all(|(path, lock)| {
        debug!("garbage_collect_session_directories() - deleting `{}`", path.display());

        if let Err(err) = safe_remove_dir_all(&path) {
            sess.emit_warning(errors::FinalizedGcFailed { path: &path, err });
        } else {
            delete_session_dir_lock_file(sess, &lock_file_path(&path));
        }

        // Let's make it explicit that the file lock is released at this point,
        // or rather, that we held on to it until here
        drop(lock);
        true
    });

    Ok(())
}

fn delete_old(sess: &Session, path: &Path) {
    debug!("garbage_collect_session_directories() - deleting `{}`", path.display());

    if let Err(err) = safe_remove_dir_all(&path) {
        sess.emit_warning(errors::SessionGcFailed { path: &path, err });
    } else {
        delete_session_dir_lock_file(sess, &lock_file_path(&path));
    }
}

fn all_except_most_recent(
    deletion_candidates: UnordMap<(SystemTime, PathBuf), Option<flock::Lock>>,
) -> UnordMap<PathBuf, Option<flock::Lock>> {
    let most_recent = deletion_candidates.items().map(|(&(timestamp, _), _)| timestamp).max();

    if let Some(most_recent) = most_recent {
        deletion_candidates
            .into_items()
            .filter(|&((timestamp, _), _)| timestamp != most_recent)
            .map(|((_, path), lock)| (path, lock))
            .collect()
    } else {
        UnordMap::default()
    }
}
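
// Illustrative sketch, not part of the compiler: the same "keep the newest,
// return the rest" selection as `all_except_most_recent`, written against
// `std::collections::HashMap` so it can be run in isolation. The name
// `all_except_most_recent_sketch` and the `()` stand-in for `flock::Lock`
// are assumptions made for this example. Note that every entry sharing the
// maximum timestamp is kept, not just one of them.
fn all_except_most_recent_sketch(
    candidates: std::collections::HashMap<(std::time::SystemTime, std::path::PathBuf), Option<()>>,
) -> std::collections::HashMap<std::path::PathBuf, Option<()>> {
    // `max()` yields `None` for an empty map, in which case nothing is returned.
    let Some(most_recent) = candidates.keys().map(|&(timestamp, _)| timestamp).max() else {
        return std::collections::HashMap::new();
    };

    candidates
        .into_iter()
        // Drop every candidate that carries the newest timestamp.
        .filter(|&((timestamp, _), _)| timestamp != most_recent)
        // The timestamp is no longer needed once the newest entry is excluded.
        .map(|((_, path), lock)| (path, lock))
        .collect()
}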

/// Paths of artifacts within session directories can get quite long, so we
/// need to support deleting files with very long paths. The regular
/// WinApi functions only support paths of up to 260 characters, however. To
/// circumvent this limitation, we canonicalize the path of the directory
/// before passing it to `std::fs::remove_dir_all()`. This will convert the path
/// into the `\\?\` format, which supports much longer paths. A path that no
/// longer exists is treated as already removed.
fn safe_remove_dir_all(p: &Path) -> io::Result<()> {
    let canonicalized = match try_canonicalize(p) {
        Ok(canonicalized) => canonicalized,
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(()),
        Err(err) => return Err(err),
    };

    std_fs::remove_dir_all(canonicalized)
}
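
// Illustrative sketch, not part of the compiler: the "canonicalize, then
// remove, and treat a missing path as success" pattern of
// `safe_remove_dir_all`, written against `std::fs::canonicalize` so it can be
// run in isolation. The name `remove_dir_all_if_present` is an assumption made
// for this example, and `std::fs::canonicalize` is a stand-in for the
// `try_canonicalize` helper used above, which may behave differently.
fn remove_dir_all_if_present(p: &std::path::Path) -> std::io::Result<()> {
    let canonicalized = match std::fs::canonicalize(p) {
        Ok(canonicalized) => canonicalized,
        // Nothing to delete: the directory may already have been removed.
        Err(err) if err.kind() == std::io::ErrorKind::NotFound => return Ok(()),
        Err(err) => return Err(err),
    };

    std::fs::remove_dir_all(canonicalized)
}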

fn safe_remove_file(p: &Path) -> io::Result<()> {
    let canonicalized = match try_canonicalize(p) {
        Ok(canonicalized) => canonicalized,
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(()),
        Err(err) => return Err(err),
    };

    match std_fs::remove_file(canonicalized) {
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        result => result,
    }
}

// On Windows the compiler would sometimes fail to rename the session directory because
// the OS thought something was still being accessed in it. So we retry a few times to give
// the OS time to catch up.
// See https://github.com/rust-lang/rust/issues/86929.
fn rename_path_with_retry(from: &Path, to: &Path, mut retries_left: usize) -> std::io::Result<()> {
    loop {
        match std_fs::rename(from, to) {
            Ok(()) => return Ok(()),
            Err(e) => {
                if retries_left > 0 && e.kind() == ErrorKind::PermissionDenied {
                    // Try again after a short waiting period.
                    std::thread::sleep(Duration::from_millis(50));
                    retries_left -= 1;
                } else {
                    return Err(e);
                }
            }
        }
    }
}
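
// Illustrative sketch, not part of the compiler: the retry loop of
// `rename_path_with_retry` with the file-system call replaced by a closure,
// so the control flow can be exercised without touching the disk. The name
// `retry_on_permission_denied` and its `delay` parameter are assumptions made
// for this example; the function above hard-codes `std_fs::rename` and 50 ms.
fn retry_on_permission_denied(
    mut op: impl FnMut() -> std::io::Result<()>,
    mut retries_left: usize,
    delay: std::time::Duration,
) -> std::io::Result<()> {
    loop {
        match op() {
            Ok(()) => return Ok(()),
            // Only `PermissionDenied` is treated as transient, and only while
            // retries remain.
            Err(e) if retries_left > 0 && e.kind() == std::io::ErrorKind::PermissionDenied => {
                std::thread::sleep(delay);
                retries_left -= 1;
            }
            // Any other error, or running out of retries, is reported to the caller.
            Err(e) => return Err(e),
        }
    }
}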