Microsoft Dev Blogs

The Convenience of System.IO

Reading and writing files is a common task in many applications. The System.IO namespace provides APIs for accessing files in .NET, making it easier for developers to work with files. In this blog post, we will explore the convenience and performance of using System.IO to read text files, with the help of the System.Text API.

The article is part of a series of blog posts on the convenience of .NET. It explains that the System.IO APIs, such as File, FileInfo, FileStream, and related types, do a lot of heavy lifting for .NET developers who need to access files. From there it narrows to one task: reading text files with System.IO and the System.Text APIs.

The author uses the example of counting lines, words, and characters in a text file to demonstrate the convenience and performance of the different file APIs. The benchmark tests the following File APIs (and their related APIs):

  • File.OpenHandle and RandomAccess.Read
  • File.Open and FileStream.Read
  • File.OpenText and StreamReader.Read
  • File.ReadLines and IEnumerable<string>
  • File.ReadAllLines and string[]
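To make the two ends of this list concrete, here is a minimal sketch that counts lines once with the lowest-level pair (File.OpenHandle and RandomAccess.Read) and once with File.ReadLines. It is not the article's benchmark code: the helper names, the 4096-byte buffer, and the temporary sample file are assumptions chosen for illustration.

```csharp
using System;
using System.IO;
using Microsoft.Win32.SafeHandles;

// Hypothetical sample file, created only for this sketch.
string samplePath = Path.GetTempFileName();
File.WriteAllText(samplePath, "one\ntwo\nthree\n");

Console.WriteLine(CountLinesWithHandle(samplePath));    // 3
Console.WriteLine(CountLinesWithReadLines(samplePath)); // 3
File.Delete(samplePath);

// Most control: the caller owns the handle, the buffer, and the file offset.
static int CountLinesWithHandle(string path)
{
    using SafeFileHandle handle = File.OpenHandle(path);
    byte[] buffer = new byte[4096]; // assumed buffer size
    long offset = 0;
    int lines = 0;
    int read;
    while ((read = RandomAccess.Read(handle, buffer.AsSpan(), offset)) > 0)
    {
        // Count newline bytes in the chunk that was just read.
        for (int i = 0; i < read; i++)
        {
            if (buffer[i] == (byte)'\n') lines++;
        }
        offset += read;
    }
    return lines;
}

// Most convenient: the runtime decodes the text and yields one string per line.
static int CountLinesWithReadLines(string path)
{
    int lines = 0;
    foreach (string line in File.ReadLines(path)) lines++;
    return lines;
}
```

The two helpers agree here because the sample ends with a newline. For a file whose last line has no trailing newline, the byte-counting version reports one line fewer, since it counts newline bytes rather than lines.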

The APIs are listed in order from the most control to the most convenience. The benchmarks also use the Encoding and Rune types from the System.Text namespace. In addition, the author uses the new SearchValues type and compares it against passing a Span<char> to Span<char>.IndexOfAny; the results show a significant performance advantage for SearchValues.
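The two search styles being compared can be sketched as follows. This is an illustrative sketch rather than the article's code: the separator set, the sample text, and the CountWords helper are assumptions, and SearchValues requires .NET 8 or later.

```csharp
using System;
using System.Buffers;

// Assumption: this separator set is illustrative, not the article's exact list.
const string Separators = " \t\r\n";

// SearchValues analyzes the set once so that repeated searches can reuse that work.
SearchValues<char> separatorValues = SearchValues.Create(Separators);

ReadOnlySpan<char> sample = "  the quick\tbrown\nfox ";

// Both calls find the first separator; only the way the set is supplied differs.
int viaSpan = sample.IndexOfAny(Separators.AsSpan());
int viaSearchValues = sample.IndexOfAny(separatorValues);
Console.WriteLine($"{viaSpan} {viaSearchValues}"); // 0 0

Console.WriteLine(CountWords(sample, separatorValues)); // 4

// Hypothetical helper: counts runs of non-separator characters.
static int CountWords(ReadOnlySpan<char> text, SearchValues<char> separators)
{
    int words = 0;
    while (!text.IsEmpty)
    {
        // Skip any run of separators to reach the start of the next word.
        int start = text.IndexOfAnyExcept(separators);
        if (start < 0) break;
        words++;
        text = text.Slice(start);

        // Jump to the separator that ends this word, if there is one.
        int end = text.IndexOfAny(separators);
        if (end < 0) break;
        text = text.Slice(end);
    }
    return words;
}
```

The SearchValues instance is created once and reused for every search, which is the usage pattern the type is designed for.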

The article then discusses the target app for the benchmark tests: a program that counts the number of lines, words, and characters in a text file. The app is modeled on the behavior of the "wc" command on Unix systems.
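A wc-style counter of this kind might look like the sketch below, which uses File.OpenText and StreamReader.Read. It is a simplified illustration under stated assumptions, not the article's implementation: CountFile is a hypothetical name, words are split with char.IsWhiteSpace, lines are counted as newline characters, and the character total counts decoded UTF-16 chars rather than bytes.

```csharp
using System;
using System.IO;

// Hypothetical sample file, created only for this sketch.
string samplePath = Path.GetTempFileName();
File.WriteAllText(samplePath, "hello world\nfoo bar baz\n");

var counts = CountFile(samplePath);
Console.WriteLine($"{counts.Lines} {counts.Words} {counts.Chars}"); // 2 5 24
File.Delete(samplePath);

// Reads the file in char chunks and tracks word boundaries across chunks.
static (int Lines, int Words, long Chars) CountFile(string path)
{
    using StreamReader reader = File.OpenText(path);
    char[] buffer = new char[4096]; // assumed buffer size
    int lines = 0, words = 0;
    long chars = 0;
    bool inWord = false;
    int read;
    while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
    {
        chars += read;
        for (int i = 0; i < read; i++)
        {
            char c = buffer[i];
            if (c == '\n') lines++;

            if (char.IsWhiteSpace(c))
            {
                inWord = false;
            }
            else if (!inWord)
            {
                // First non-whitespace char after whitespace starts a new word.
                inWord = true;
                words++;
            }
        }
    }
    return (lines, words, chars);
}
```

Keeping the inWord flag outside the read loop means a word that straddles two buffer reads is still counted once.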

The article presents the results of the benchmark tests, which measure the lines of code in each implementation, the execution speed, and the memory usage. The tests were run on a .NET 8 build that was close to the final GA version at the time of writing.

The lines-of-code chart shows that the benchmark implementations fall into two clusters, with slight differences within each cluster to accommodate the different APIs. The chart also includes the line count for the wc implementation, which is much longer than the .NET implementations. However, wc also has functionality beyond the scope of this article.

The article then compares the functionality of the C# implementation to the wc implementation, and shows that the results are almost identical, with just a difference of one word in the total word count.

Next, the article looks at the performance of scanning a small text file with just 1KB of text. The results show that the difference in execution time is within 1 microsecond, which is negligible for most applications.

The article also compares the memory usage of the different file APIs when working with the same small text file. The results show that the byte-based APIs have a fixed cost, while the string-based APIs have variable memory requirements depending on the size of the document.
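One way a byte-based reader can keep its memory cost fixed regardless of file size is to reuse a single buffer, for example one rented from ArrayPool. The sketch below pairs File.Open and FileStream.Read with ArrayPool<byte>.Shared. The helper name, the buffer size, and the sample file are assumptions for illustration, not the article's benchmark code.

```csharp
using System;
using System.Buffers;
using System.IO;

// Hypothetical sample file, created only for this sketch.
string samplePath = Path.GetTempFileName();
File.WriteAllText(samplePath, "one\ntwo\nthree\n");

Console.WriteLine(CountNewlinesPooled(samplePath)); // 3
File.Delete(samplePath);

// Counts newline bytes while reusing one pooled buffer for every read.
static int CountNewlinesPooled(string path)
{
    // Rent may return an array larger than requested, so only the first
    // 'read' bytes of each chunk are inspected.
    byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);
    try
    {
        using FileStream stream = File.Open(path, FileMode.Open, FileAccess.Read);
        int lines = 0;
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] == (byte)'\n') lines++;
            }
        }
        return lines;
    }
    finally
    {
        // Always hand the buffer back so later callers can reuse it.
        ArrayPool<byte>.Shared.Return(buffer);
    }
}
```

By contrast, File.ReadAllLines must allocate a string for every line plus the array that holds them, so its memory use grows with the document.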

The article then discusses the performance of reading a larger text file of around 600KB. The results show that the APIs that return bytes and chars perform similarly, at slightly above 1 millisecond. There is also a difference between File.ReadLines and File.ReadAllLines, but the performance gap is only around 2 milliseconds for a file of this size.

The article concludes by highlighting the convenience of the higher-level APIs and the excellent performance they offer. It also mentions the use of ArrayPool arrays to demonstrate the difference in memory usage between the higher-level and lower-level APIs. Overall, the article emphasizes the convenience and performance benefits of using System.IO for reading text files in .NET applications.