{"version":"https://jsonfeed.org/version/1","title":"Kyrylo Khlopko","home_page_url":"https://khlopko.com/","feed_url":"https://khlopko.com/feed.json","description":"Notes on software engineering, Swift, Go, Rust, and building better tools.","favicon":"https://khlopko.com//assets/favicon.ico","expired":false,"author":{"name":"Kyrylo Khlopko","url":"https://khlopko.com/"},"items":[{"id":"3d2cad3806951d921e98339da4967ccf2596acc2","title":"1brc in Swift","summary":"","content_text":"There has been (and probably will be for a while) a fair bit of interest in a challenge that was initially posted for Java, yet it turned out to be an interesting task and spread to other languages: the One Billion Row Challenge. Here it is - gunnarmorling/1brc. I heard about it around a month ago and added it to a \u0026ldquo;some day\u0026rdquo; list. Now I have finally tried my hand at it.\nBriefly about the challenge itself. In essence, it is an easy task - read rows of well-formatted data line by line and calculate a few measurements, the kind of thing you do when you start learning to program. But that misleading simplicity comes with a nuance - there are 1 billion lines to process, and you have to do it as fast as you can. When something is measured in billions, the complexity quickly goes into outer space. A bit of math: every nanosecond of slowdown in processing a single line makes the program take a second longer. How often do you think about program performance in terms of nanoseconds?\nAs for me, I have only a little experience with optimisations at this level. I think about the performance of the code in every task, but there is rarely a need to process such a large collection of data in one go, and fast. Such optimisations are also costly in time, and premature optimisation has never been a good idea; reasonably fast is usually good enough. 
Given that, my knowledge of the data structures I was going to use, of file reading, memory access and other such things was the only help I had in solving this task.\nI didn\u0026rsquo;t capture every step from the most naive implementation to the best one I have reached, but I did capture most of them. The most unoptimised version I initially came up with, reading line by line, would have taken around 30 to 40 minutes to complete; that is only an estimate, because a faster implementation could be written in half of that time.\nRecap: The Challenge\nThe challenge is to read a file with a billion rows, each row containing a city name and a temperature. You need to calculate the min, max and average temperature for each city. The file is well-formatted, so we can assume that each line contains a city name and a temperature (with exactly one fractional digit) separated by a semicolon. The file is around 13GB in size. The output should be sorted by city name. The goal is to make it as fast as possible. Here is a nice illustration from the original repo:\nReplacing \u0026ldquo;Java\u0026rdquo; everywhere with \u0026ldquo;Swift\u0026rdquo;, we are taking off now.\nTest conditions\nMost of the time I was testing on my MacBook M1 without power plugged in. Even though I do not have any power-saving modes turned on, Apple slightly decreases CPU performance on battery. So when I started running timings with the laptop plugged in, I got a speed-up of around 0.8s on every test. For the first implementations it was an insignificant difference, but as the running time decreases it becomes a visible change. The numbers I am presenting here were measured using the hyperfine tool with only the terminal running (and a bunch of stuff macOS runs in the background), plugged in to power, on a MacBook M1 Pro with 16GB of memory and macOS Sonoma 14.4.\nTake one: Not so naive\nSkipping the lost naive version, let\u0026rsquo;s start with the obvious and easy decisions. 
For example, how are we going to represent the data once it has been parsed? We need to calculate min, max and average. The first two can be calculated at the time of reading; for the average it looks like we need all the values. In the most naive version that would be an array of these values, but that is a waste of time and space. To calculate the average we only need a sum and a count, just two numbers:\nstruct Measurement {\n    let min: Double\n    let max: Double\n    let avg: Double\n    let count: Int\n}\nRight from the start we can also make one more improvement: avoid parsing doubles. That is a much more complex task than parsing an integer and performing operations on it. We know that all the numbers in the input have exactly one digit in the decimal part, so we can work with integers most of the time, converting them into doubles only in the final steps:\nstruct Measurement {\n    let name: String\n    let min: Int\n    let max: Int\n    let avg: Int\n    let count: Int\n}\nThe whole file then needs to be loaded into memory. Instead of going back and forth between the file system and memory, we load it into a Data to have faster access to it. The system may decide to move it to swap, but that is still faster than reading from the file.\nWe store results in a dictionary Dictionary\u0026lt;Int, Measurement\u0026gt;. The key is not a String, because using a city name as a key is inefficient for two reasons. First, you need to parse the name into a string (or at least into a byte buffer, which is still not efficient). Second, default hashing is not that fast and will most likely iterate over the name again. To solve that, we can compute the hash while we parse the name. Then we won\u0026rsquo;t need to convert the name to a string each time, only the first time. I only later realized that converting bytes into a string could be done much later, but if you think about it - there is not a lot of difference. 
We know that there are only 413 different stations, and the more demanding version has 10K of them, yet neither of these makes a significant difference here, since each name is converted only once.\nFinally, it is important to allocate the dictionary with enough capacity beforehand. We know how many stations there are, so we can use that to reduce allocations as we add elements to a dictionary; we allocate each dictionary with an initial capacity of 500 keys:\nvar result = Dictionary\u0026lt;Int, Measurement\u0026gt;(minimumCapacity: 500)\nYou can check the whole code (a bit messy though) at this commit.\nThat gives us a running time of around 2 minutes 45 seconds. Not bad for such simple changes while still reading the file line by line.\nCPU and chunks\nWe are clearly not using the full potential of modern computers with our implementation. For example, my MacBook Pro M1 has 10 cores, and we are using only 1 of them, processing 1 line at a time. Let\u0026rsquo;s change this.\nTo parallelise processing effectively, we need to split the whole file into chunks, so that they can be processed independently. We don\u0026rsquo;t know yet how many chunks will give us the best performance, so let\u0026rsquo;s make the number of chunks a configurable parameter and play with it later.\nlet chunkCount = 10 // setting to number of cores for start\nlet chunkSize = fileSize / chunkCount\nCurrently, we are going to load 1.3GB of data for each chunk and process it on one of the cores. Now, the lines in the file have different lengths, and each chunk will most likely end at an arbitrary position, somewhere in the middle of a line. 
But we need clean line boundaries, so using chunkSize as a starting point we are going to extend each chunk so that it ends exactly at the end of a line:\nvar offset = 0\nfunc nextChar() -\u0026gt; UInt8 {\n    // read char from file handle and update offset\n}\nwhile offset \u0026lt;= maxOffset {\n    let chunkStart = offset\n    offset += chunkSize\n    // 10 is ASCII code for new line\n    while offset \u0026lt;= maxOffset \u0026amp;\u0026amp; currChar != 10 {\n        currChar = nextChar()\n    }\n    // restore boundary if needed\n    if offset \u0026gt; maxOffset {\n        offset = maxOffset\n    }\n    let currChunkSize = offset - chunkStart\n    process(start: chunkStart, size: currChunkSize)\n    // read char after new line\n    currChar = nextChar()\n}\nThe code above reads until the end of a line after we have skipped to the end of a chunk, which is an extremely small amount of work: a line is a city name (presumably around 40 characters max) and a temperature (up to 5 chars in total), so in the worst case we run the inner while loop ~45 times per chunk, and with 10 chunks that is 450 iterations. If we increase the number of chunks significantly, e.g. to a few thousand, it will take 45 * 2000 = 90_000 iterations. That is still a small amount of time (~0.05 seconds), which could be a subject for optimisation if nothing else is left to optimise, but we can consider it irrelevant, since in practice it is more likely to be around half of that anyway.\nTo run the processing of chunks we are going to use the new Swift Concurrency capabilities, namely a task group. The runtime already takes care of not scheduling too many tasks at once, avoiding thread explosion, but we still have to be mindful of chunks not being too small. To decompose it even further, we are going to introduce two actors: one for reading a chunk of data, the second for parsing it into a dictionary as a partial result.\ntypealias PartialResult = [Int64: Measurement]\nwhile offset \u0026lt;= maxOffset {\n    // ...\n
}\nlet result = await withTaskGroup(of: PartialResult.self) { group in\n    for chunk in chunks {\n        group.addTask {\n            let reader = ChunkReader(fd: fd)\n            let data = await reader.run(start: chunkStart, size: currChunkSize)\n            let parser = LineParser()\n            let partialResult = await parser.run(data: data)\n            return partialResult\n        }\n    }\n    var result = PartialResult(minimumCapacity: 500)\n    for await partial in group {\n        merge(partial, into: \u0026amp;result)\n    }\n    return result\n}\nactor ChunkReader {\n    private let fd: FileDescriptor\n    init(fd: FileDescriptor) {\n        self.fd = fd\n    }\n    func run(start: Int, size: Int) -\u0026gt; Data {\n        // read raw data\n    }\n}\nactor LineParser {\n    func run(data: Data) -\u0026gt; PartialResult {\n        // parse lines from the loaded data\n    }\n}\nThis allows us to utilise the CPU to the max by constantly running either reading or parsing, without blocking. The actors here aren\u0026rsquo;t actually protecting any shared state; they act as isolation regions for the running tasks.\nWe also need to adjust our Measurement structure to store the name as Data instead of converting it to a string. We will only convert it to a string when the result needs to be displayed.\nstruct Measurement {\n    let name: Data\n    // ...\n}\nAs a result, after some experimenting with the number of chunks, using 1024 chunks reduced the time drastically to ~25 seconds. The implementation is available here.\nThat is a good result so far, but we can do better; there is still room for improvement.\nImmediate scheduling and more effective file reading\nWe create the group to run tasks only after all chunks have been collected. And chunks are collected sequentially, meaning we are losing valuable time waiting for all of them to be defined first. Instead, we are going to make scanning for chunks part of the task group:\nlet result = await withTaskGroup(of: PartialResult.self) { group in\n    while offset \u0026lt;= maxOffset {\n        // ...\n        group.addTask {\n            // ...\n
        }\n    }\n}\nThat isn\u0026rsquo;t going to give us much of a speed-up, but at least we are using resources more consciously.\nThe file reading can be improved as well. Until now we were using the FileHandle abstraction to read from the file, but it is a wrapper over the C API, which adds overhead, and it uses the Data type, which might not be as efficient as we expect. We actually just need an array of UInt8, and the C API gives us raw bytes directly, so we can avoid needless conversions back and forth.\nSo what we are going to do is replace FileHandle with fopen:\nlet file = fopen(path, \u0026#34;r\u0026#34;)!\nThen, read into a byte array:\nlet buffer = UnsafeMutableRawPointer.allocate(byteCount: chunkSize, alignment: MemoryLayout\u0026lt;UInt8\u0026gt;.alignment)\nfread(buffer, 1, chunkSize, file)\nlet rawBytes = buffer.bindMemory(to: UInt8.self, capacity: chunkSize)\nlet byteArray = Array(UnsafeBufferPointer(start: rawBytes, count: chunkSize))\nbuffer.deallocate()\nSince we aren\u0026rsquo;t using Data anywhere anymore, Measurement has to change:\nstruct Measurement {\n    let name: ArraySlice\u0026lt;UInt8\u0026gt;\n    // ...\n}\nUsing ArraySlice allows us to avoid copying memory each time, which clearly saves us a lot.\nWith these changes in effect, the time has been further reduced to 10s. Implementation is here.\nRunning out of ideas\nAt this point I had almost run out of improvements. A few tweaks brought the running time down to 9s:\nChange the number of chunks again, to 2048 now: since we have faster processing, we can benefit from more chunks.\nFix hashing for the final result collection.\nForce inlining of some of the methods.\nSimplify temperature parsing.\nLater, I suspected that the hashing I was using was giving me collisions, so I\u0026rsquo;ve changed it to an FNV-1a implementation, which we discuss later. 
That hasn\u0026rsquo;t made any difference to performance.\nState of the code at this point can be found here.\nFile reading: one more time\nIf we take a look at the reading of the file, we can notice that it doesn\u0026rsquo;t benefit a lot from concurrency: we still have exclusive access and shared state in the form of a file pointer. That is a bottleneck in our reading part. We also open the file for each chunk we are processing, because a file handle cannot be passed safely between concurrently running code. On the other hand, there is the file descriptor, which is a safe alternative for faster concurrent access.\nWe are going to change our file reading to use a file descriptor:\nCreate a single file descriptor, so we open the file only once, then share it among readers.\nUse the pread API equivalent on FileDescriptor to read from the file concurrently.\nLimit the number of readers to the number of cores.\nMost of the changes aren\u0026rsquo;t hard to implement, yet we need to address the fgetc we\u0026rsquo;ve been using before. Due to its stateful behaviour, it has been advancing the file position automatically for us. Now we are going to avoid modifying the descriptor state. To handle that, we create a replica of this function:\nfunc getc() -\u0026gt; UInt8 {\n    let buffer = UnsafeMutableRawBufferPointer.allocate(byteCount: 1, alignment: MemoryLayout\u0026lt;UInt8\u0026gt;.alignment)\n    defer { buffer.deallocate() }\n    let bytesRead = try! fd.read(fromAbsoluteOffset: offset, into: buffer)\n    offset += bytesRead\n    return buffer.bindMemory(to: UInt8.self)[0]\n}\nAnd replace calls to fgetc() with getc().\nReading using descriptors gave around 2s of improvement, with the time down to ~7s to process 1 billion rows. This version, with on-battery measurements in the README, sits by the link.\n10K and capacity\nThe challenge has another, more demanding dataset, with 10k different stations instead of the 413 in the default one. 
I was curious how well the implementation would perform on this dataset, since I\u0026rsquo;ve made some assumptions about dictionary capacity. Without capacity modifications, it takes almost twice as long to process - 12 seconds.\nAt this point I changed the capacity to 11k (slightly more than the number of stations), and remembered a property of many containers - they grow by doubling their underlying storage. So instead of setting it to a somewhat random number, I\u0026rsquo;d better use a power of 2. 2^14 = 16384 is the smallest power of 2 greater than 10k, so here we go: let capacity = 1 \u0026lt;\u0026lt; 14. This little change has led to a drastic improvement, with the running time reduced to 8.8s. We sliced off 3 seconds just by using a better capacity.\nBy this point I already suspected that the initial dictionary capacity of 500 isn\u0026rsquo;t enough. So I\u0026rsquo;ve tried running the 10k improvement on the default dataset, and got a 0.5s improvement, with the running time dropping to 6.5s.\nFinal implementation: click here.\nIdeas for further improvement\nFor now, I have mostly run out of ideas on how to improve it further. Reading is fast; the main bottlenecks are in parsing.\nMost likely, if we could avoid the conversion to [UInt8] and use a pointer that we advance over the data, we would be able to reduce the time, since there would be no overhead for array creation and for the bounds checks it performs on subscript. It turned out to be a more complex task than I thought, and for now it is still an idea to check.\nThe second bottleneck is the dictionary. Even though its implementation is efficient, and we are using efficient hashing paired with thoughtful memory allocation, it still costs more than array access, plus we have to create a new structure each time. The hashing algorithm we are using does not produce collisions on our data set, so we can modify it to produce array indexes and migrate from a dictionary to an array. 
I would expect that to also give a huge time improvement, if these assumptions hold.\nTemperature is constrained to values from -99.9 to 99.9, with exactly one fractional digit. We already benefit from the single fractional digit by relying on it implicitly in code, but the range still leaves room for improvement.\nThe rest of the possible improvements would presumably include SIL generation analysis, looking at assembly code, use of SIMD, and so on. It would be interesting to dive into that some day, but for now there are still other options to try before that.\nFinal thoughts\nIt turned out to be a complex task for me, but not overly so. 6.5 seconds on 1 billion rows seems to be a pretty good result. Such tasks make you learn some internals of the language that you don\u0026rsquo;t bump into very often if such performance is not your main concern.\n","content_html":"\u003cp\u003eThere has been (and probably will be for a while) a fair bit of interest in a challenge that was initially posted for Java, yet it\nturned out to be an interesting task and spread to other languages: the One Billion Row Challenge.\nHere it is - \u003ca href=\"https://github.com/gunnarmorling/1brc\"\u003egunnarmorling/1brc\u003c/a\u003e. I heard about it around a month ago and added it to\na \u0026ldquo;some day\u0026rdquo; list. Now I have finally tried my hand at it.\u003c/p\u003e\n\u003cp\u003eBriefly about the challenge itself. In essence, it is an easy task - read rows of well-formatted data line by line and calculate\na few measurements, the kind of thing you do when you start learning to program. But that misleading simplicity comes with a nuance -\nthere are \u003cstrong\u003e1 billion\u003c/strong\u003e lines to process, and you have to do it as fast as you can. When something is measured in billions,\nthe complexity quickly goes into outer space. A bit of math: every \u003cem\u003enanosecond\u003c/em\u003e of slowdown in processing a single line\nmakes the program take a second longer. 
How often do you think about program performance in terms of nanoseconds?\u003c/p\u003e\n\u003cp\u003eAs for me, I have only a little experience with optimisations at this level. I think about the performance of the code\nin every task, but there is rarely a need to process such a large collection of data in one go, and fast. Such optimisations are also costly\nin time, and premature optimisation has never been a good idea; reasonably fast is usually good enough. Given that, my knowledge of\nthe data structures I was going to use, of file reading, memory access and other such things was the only help I had in solving this task.\u003c/p\u003e\n\u003cp\u003eI didn\u0026rsquo;t capture every step from the most naive implementation to the best one I have reached, but I did capture most of them.\nThe most unoptimised version I initially came up with, reading line by line, would have taken around 30 to 40 minutes to complete;\nthat is only an estimate, because a faster implementation could be written in half of that time.\u003c/p\u003e\n\u003ch1 id=\"recap-the-challenge\"\u003eRecap: The Challenge\u003c/h1\u003e\n\u003cp\u003eThe challenge is to read a file with a billion rows, each row containing a city name and a temperature. You need to\ncalculate the min, max and average temperature for each city. The file is well-formatted, so we can assume that each line\ncontains a city name and a temperature (with exactly one fractional digit) separated by a semicolon. The file is around 13GB\nin size. The output should be sorted by city name. The goal is to make it as fast as possible. 
Here is a nice illustration\nfrom the original repo:\u003c/p\u003e\n\u003cp\u003e\u003cimg\n  src=\"1brc.png\"\n  alt=\"\"\n  loading=\"lazy\"\n  decoding=\"async\"\n  class=\"full-width\"\n/\u003e\n\n\u003c/p\u003e\n\u003cp\u003eReplacing \u0026ldquo;Java\u0026rdquo; everywhere with \u0026ldquo;Swift\u0026rdquo;, we are taking off now.\u003c/p\u003e\n\u003ch1 id=\"test-conditions\"\u003eTest conditions\u003c/h1\u003e\n\u003cp\u003eMost of the time I was testing on my MacBook M1 without power plugged in. Even though I do not have any power-saving\nmodes turned on, Apple slightly decreases CPU performance on battery. So when I started running timings with the laptop plugged in,\nI got a speed-up of around 0.8s on every test. For the first implementations it was an insignificant difference, but as\nthe running time decreases it becomes a visible change. The numbers I am presenting here were measured using the \u003ccode\u003ehyperfine\u003c/code\u003e tool\nwith only the terminal running (and a bunch of stuff macOS runs in the background), plugged in to power, on a MacBook M1 Pro with\n16GB of memory and macOS Sonoma 14.4.\u003c/p\u003e\n\u003ch1 id=\"take-one-not-so-naive\"\u003eTake one: Not so naive\u003c/h1\u003e\n\u003cp\u003eSkipping the lost naive version, let\u0026rsquo;s start with the obvious and easy decisions. For example, how are\nwe going to represent the data once it has been parsed? We need to calculate min, max and average. The first two can be calculated\nat the time of reading; for the average it looks like we need all the values. In the most naive version that would be an array of these values,\nbut that is a waste of time and space. 
To calculate average we only need a sum and a count, just two numbers:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003estruct\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003eMeasurement\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003emin\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eDouble\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003emax\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eDouble\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eavg\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eDouble\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ecount\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eInt\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan 
style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eRight from the start we can also make one more improvement: parsing doubles. That a much more complex task than parsing\ninteger and perform operations on it. We know that all the numbers in the input has exactly one digit in the decimal part,\nso we can work with integers most of the time, converting them into doubles only in the final steps:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003estruct\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003eMeasurement\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ename\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eString\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003emin\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eInt\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003emax\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eInt\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan 
style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eavg\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eInt\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ecount\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eInt\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eThe whole file then needs to be loaded into memory. Instead of going back and forth between the file system and memory, we load it into a \u003ccode\u003eData\u003c/code\u003e to\nhave faster access to it. The system may decide to move it to swap, but that is still faster than reading from the file.\u003c/p\u003e\n\u003cp\u003eWe store results in a dictionary \u003ccode\u003eDictionary\u0026lt;Int, Measurement\u0026gt;\u003c/code\u003e. The key is not a \u003ccode\u003eString\u003c/code\u003e, because using a city name as\na key is inefficient for two reasons. First, you need to parse the name into a string (or at least into a byte buffer,\nwhich is still not efficient). Second, default hashing is not that fast and will most likely iterate over the name again.\nTo solve that, we can compute the hash while we parse the name. Then we won\u0026rsquo;t need to convert the name to a string each time,\nonly the first time. I only later realized that converting bytes into a string could be done much later, but if you\nthink about it - there is not a lot of difference. 
We know that there are only 413 different stations, and the more demanding\nversion has 10K of them, yet neither of these makes a significant difference here, since each name is converted only once.\u003c/p\u003e\n\u003cp\u003eFinally, it is important to allocate the dictionary with enough capacity beforehand. We know how many stations there are, so we can\nuse that to reduce allocations as we add elements to a dictionary; we allocate each dictionary with an initial\ncapacity of 500 keys:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003evar\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eDictionary\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e\u0026lt;\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eInt\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eMeasurement\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e\u0026gt;(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eminimumCapacity\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e500\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eYou can check the whole code (a bit messy though) at \u003ca href=\"https://github.com/khlopko/1brc-swift/tree/6aa5c39a66746bb6eafea5f03e0d146a6b92062c\"\u003ethis commit\u003c/a\u003e.\u003c/p\u003e\n\u003cp\u003eThat gives us a running time of around 2 minutes 45 seconds. 
Not bad for such simple changes while still reading the file\nline by line.\u003c/p\u003e\n\u003ch1 id=\"cpu-and-chunks\"\u003eCPU and chunks\u003c/h1\u003e\n\u003cp\u003eWe are clearly not using the full potential of modern computers with our implementation. For example, my MacBook Pro M1\nhas 10 cores, and we are using only 1 of them, processing 1 line at a time. Let\u0026rsquo;s change this.\u003c/p\u003e\n\u003cp\u003eTo parallelise processing effectively, we need to split the whole file into chunks, so that they can be processed independently.\nWe don\u0026rsquo;t know yet how many chunks will give us the best performance, so let\u0026rsquo;s make the number of chunks a configurable\nparameter and play with it later.\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkCount\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e10\u003c/span\u003e \u003cspan style=\"color:#75715e\"\u003e// setting to number of cores for start\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkSize\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003efileSize\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e/\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkCount\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eCurrently, we are going to load 1.3GB of data for each chunk and process it on one 
of the cores. Now, the lines in the\nfile has different lengths and each chunk is more likely to end on an arbitrary position in the line, more likely to be\nsomewhere in the middle. But we need to have clear line boundaries, so using \u003ccode\u003echunkSize\u003c/code\u003e as starting point we are going\nto adjust it to be exactly till the end of line:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003evar\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e0\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003efunc\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003enextChar\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e()\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e-\u0026gt;\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eUInt8\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#75715e\"\u003e// read char from file handle and update offset\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan 
style=\"color:#00a8c8\"\u003ewhile\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e\u0026lt;=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003emaxOffset\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkStart\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e+=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkSize\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#75715e\"\u003e// 10 is ASCII code for new line\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003ewhile\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e\u0026lt;=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003emaxOffset\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e\u0026amp;\u0026amp;\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ecurrChar\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e!=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e10\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#111\"\u003ecurrChar\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e 
\u003cspan style=\"color:#111\"\u003enextChar\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e()\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#75715e\"\u003e// restore boundary if needed\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003eif\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e\u0026gt;\u003c/span\u003e \u003cspan style=\"color:#111\"\u003emaxOffset\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003emaxOffset\u003c/span\u003e \n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ecurrChunkSize\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e-\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkStart\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003eprocess\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan 
style=\"color:#111\"\u003estart\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkStart\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003esize\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ecurrChunkSize\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#75715e\"\u003e// read char after new line\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003ecurrChar\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003enextChar\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e()\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eThe code above reads to the end of a line after jumping to the nominal end of a chunk, which is an extremely\nsmall amount of work: a line is a city name (presumably around 40 characters max) and a temperature (up to 5 characters in total),\nso in the worst case the inner while loop runs ~45 times per chunk, and with 10 chunks that is 450 iterations. If\nwe increase the number of chunks significantly, e.g. to a couple of thousand, it will take \u003ccode\u003e45 * 2000 = 90_000\u003c/code\u003e iterations. 
That\nis still a small amount of time (~0.05 seconds), which could become a subject for optimisation if nothing else is left to optimise,\nbut we can consider it irrelevant, since in practice it is more likely to be around half of that anyway.\u003c/p\u003e\n\u003cp\u003eTo process the chunks we are going to use the new Swift Concurrency capabilities, namely a task group. The implementation\nalready takes care of not scheduling too many tasks at once, avoiding thread explosion, but we still have to be mindful\nof chunks not being too small. To decompose it even further, we are going to introduce two actors: one for reading a chunk\nof data, and a second for parsing it into a dictionary as a partial result.\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003etypealias\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ePartialResult\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e[\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eInt64\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eMeasurement\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e]\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003ewhile\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e\u0026lt;=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003emaxOffset\u003c/span\u003e \u003cspan 
style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#75715e\"\u003e// ...\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eawait\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ewithTaskGroup\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eof\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ePartialResult\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#00a8c8\"\u003eself\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e \u003cspan style=\"color:#111\"\u003egroup\u003c/span\u003e \u003cspan style=\"color:#00a8c8\"\u003ein\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003efor\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunk\u003c/span\u003e \u003cspan style=\"color:#00a8c8\"\u003ein\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunks\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#111\"\u003egroup\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eaddTask\u003c/span\u003e 
\u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ereader\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eChunkReader\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003efd\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003efd\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003edata\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eawait\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ereader\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003erun\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003estart\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkStart\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003esize\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ecurrChunkSize\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eparser\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e 
\u003cspan style=\"color:#111\"\u003eLineParser\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e()\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003epartialResult\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eawait\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eparser\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003erun\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003edata\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003edata\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            \u003cspan style=\"color:#00a8c8\"\u003ereturn\u003c/span\u003e \u003cspan style=\"color:#111\"\u003epartialResult\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003evar\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ePartialResult\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eminimumCapacity\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan 
style=\"color:#ae81ff\"\u003e500\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003efor\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eawait\u003c/span\u003e \u003cspan style=\"color:#111\"\u003epartial\u003c/span\u003e \u003cspan style=\"color:#00a8c8\"\u003ein\u003c/span\u003e \u003cspan style=\"color:#111\"\u003egroup\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#111\"\u003emerge\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003epartial\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003einto\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003ereturn\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003eactor\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eChunkReader\u003c/span\u003e \u003cspan 
style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003eprivate\u003c/span\u003e \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003efd\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eFileDescriptor\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003einit\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003efd\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eFileDescriptor\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#00a8c8\"\u003eself\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003efd\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003efd\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003efunc\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003erun\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan 
style=\"color:#111\"\u003estart\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eInt\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003esize\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eInt\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e-\u0026gt;\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eData\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#75715e\"\u003e// read raw data\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003eactor\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eLineParser\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003efunc\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003erun\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003edata\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eData\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e \u003cspan 
style=\"color:#111\"\u003e-\u0026gt;\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ePartialResult\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#75715e\"\u003e// parse lines from the loaded data\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eThis allows us to keep the CPU busy by constantly running either reading or parsing, without blocking. The actors here aren\u0026rsquo;t\nactually protecting any shared state, but act as isolation regions for running tasks.\u003c/p\u003e\n\u003cp\u003eWe also need to adjust our \u003ccode\u003eMeasurement\u003c/code\u003e structure to store the name as \u003ccode\u003eData\u003c/code\u003e instead of converting it to a string. 
We will\nonly convert it to a string when the result needs to be displayed.\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003estruct\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003eMeasurement\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ename\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eData\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#75715e\"\u003e// ...\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eAs a result, after some experimenting with the number of chunks and settling on 1024, the time dropped drastically to ~25 seconds.\nThe implementation is \u003ca href=\"https://github.com/khlopko/1brc-swift/tree/ce32b71e98a1e94ea3e23a75c249152e3a93806b\"\u003eavailable here\u003c/a\u003e.\u003c/p\u003e\n\u003cp\u003eThat is a good result so far, but there is still room for improvement.\u003c/p\u003e\n\u003ch1 id=\"immediate-scheduling-and-more-effective-file-reading\"\u003eImmediate scheduling and more effective file reading\u003c/h1\u003e\n\u003cp\u003eWe create the task group only after all chunks have been collected. 
And chunks are collected sequentially, meaning we are\nlosing valuable time waiting for all of them to be defined first. Instead, we are going to make scanning for chunks\npart of the task group:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eawait\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ewithTaskGroup\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eof\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ePartialResult\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#00a8c8\"\u003eself\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e \u003cspan style=\"color:#111\"\u003egroup\u003c/span\u003e \u003cspan style=\"color:#00a8c8\"\u003ein\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003ewhile\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e\u0026lt;=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003emaxOffset\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#75715e\"\u003e// ...\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan 
style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#111\"\u003egroup\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eaddTask\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            \u003cspan style=\"color:#75715e\"\u003e// ...\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eThat isn\u0026rsquo;t going to give us much of a speed-up, but at least we are using resources more consciously.\u003c/p\u003e\n\u003cp\u003eThe file reading can be improved as well. 
So far we have been using the \u003ccode\u003eFileHandle\u003c/code\u003e abstraction to read from the file,\nbut it is a wrapper over the C API, which adds overhead, and it uses the \u003ccode\u003eData\u003c/code\u003e type, which might not be as efficient as we expect it\nto be. All we actually need is an array of UInt8, which the C API lets us read directly, so we can avoid needless conversions\nback and forth.\u003c/p\u003e\n\u003cp\u003eSo what we are going to do is replace \u003ccode\u003eFileHandle\u003c/code\u003e with \u003ccode\u003efopen\u003c/code\u003e:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003efile\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003efopen\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003epath\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#d88200\"\u003e\u0026#34;r\u0026#34;\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\u003cspan style=\"color:#f92672\"\u003e!\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eThen, read into a byte array:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ebuffer\u003c/span\u003e \u003cspan 
style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eUnsafeMutableRawPointer\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eallocate\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003ebyteCount\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkSize\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ealignment\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eMemoryLayout\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e\u0026lt;\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eUInt8\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e\u0026gt;.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003ealignment\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003efread\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003ebuffer\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkSize\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003efile\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003erawBytes\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan 
style=\"color:#111\"\u003ebuffer\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003ebindMemory\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eto\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eUInt8\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#00a8c8\"\u003eself\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ecapacity\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkSize\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ebyteArray\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eArray\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eUnsafeBufferPointer\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003estart\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003erawBytes\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ecount\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003echunkSize\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e))\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003ebuffer\u003c/span\u003e\u003cspan 
style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003edeallocate\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e()\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eSince we aren\u0026rsquo;t using \u003ccode\u003eData\u003c/code\u003e anywhere any more, \u003ccode\u003eMeasurement\u003c/code\u003e has to change:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003estruct\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003eMeasurement\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ename\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eArraySlice\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e\u0026lt;\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eUInt8\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e\u0026gt;\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#75715e\"\u003e// ...\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eUsing \u003ccode\u003eArraySlice\u003c/code\u003e lets us avoid copying memory for each name, which clearly saves us a lot.\u003c/p\u003e\n\u003cp\u003eWith these changes in effect, the 
time has been further reduced to 10s. \u003ca href=\"https://github.com/khlopko/1brc-swift/tree/becc8be2a7da9afe5b38503ad5c4c6d0635e4f65\"\u003eImplementation is here\u003c/a\u003e.\u003c/p\u003e\n\u003ch1 id=\"running-out-of-ideas\"\u003eRunning out of ideas\u003c/h1\u003e\n\u003cp\u003eAt this point I had almost run out of improvements. A few tweaks brought the running time down to 9s:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eChange chunks again, now to 2048: since processing is faster, we can benefit from more chunks.\u003c/li\u003e\n\u003cli\u003eFix hashing for final result collection.\u003c/li\u003e\n\u003cli\u003eForce inlining of some of the methods.\u003c/li\u003e\n\u003cli\u003eSimplify temperature parsing.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eLater, I suspected that the hashing I was using was giving me collisions, so I switched to an FNV-1a\nimplementation, which we discuss later. That didn\u0026rsquo;t bring any performance improvement.\u003c/p\u003e\n\u003cp\u003eState of the code at this point \u003ca href=\"https://github.com/khlopko/1brc-swift/tree/3073a3fad8a08f6d9d34ca29e3678a583e8c90f3\"\u003ecan be found here\u003c/a\u003e.\u003c/p\u003e\n\u003ch1 id=\"file-reading-one-more-time\"\u003eFile reading: one more time\u003c/h1\u003e\n\u003cp\u003eIf we look at how the file is read, we can notice that it doesn\u0026rsquo;t benefit much from concurrency: we still have\nexclusive access and shared state in the form of a pointer. That is a bottleneck in our reading part. We also open the file for\neach chunk we are processing, because a file handle cannot be passed safely between concurrently running code. 
On the other\nhand, there is a file descriptor, which is a safe alternative to use for faster concurrent access.\u003c/p\u003e\n\u003cp\u003eWe are going to switch our file reading to a file descriptor:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eCreate a single file descriptor, so we open the file only once, then share it among readers.\u003c/li\u003e\n\u003cli\u003eUse the \u003ccode\u003epread\u003c/code\u003e-equivalent API on \u003ccode\u003eFileDescriptor\u003c/code\u003e to read from the file concurrently.\u003c/li\u003e\n\u003cli\u003eLimit the number of readers to the number of cores.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eMost of the changes aren\u0026rsquo;t hard to implement, yet we need to address the \u003ccode\u003efgetc\u003c/code\u003e we\u0026rsquo;ve been using before. Because it is\nstateful, it has been advancing the read position automatically for us. Now we want to avoid modifying the descriptor\u0026rsquo;s state. To\nhandle that, we write our own replacement for this function, which tracks the offset itself:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-swift\" data-lang=\"swift\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003efunc\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003egetc\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e()\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e-\u0026gt;\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eUInt8\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ebuffer\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eUnsafeMutableRawBufferPointer\u003c/span\u003e\u003cspan 
style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eallocate\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003ebyteCount\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ealignment\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eMemoryLayout\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e\u0026lt;\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eUInt8\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e\u0026gt;.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003ealignment\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003edefer\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e{\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#111\"\u003ebuffer\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003edeallocate\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e()\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003elet\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ebytesRead\u003c/span\u003e \u003cspan style=\"color:#111\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#00a8c8\"\u003etry\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e!\u003c/span\u003e \u003cspan 
style=\"color:#111\"\u003efd\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eread\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003efromAbsoluteOffset\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003einto\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ebuffer\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003eoffset\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e+=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ebytesRead\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003ereturn\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ebuffer\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#111\"\u003ebindMemory\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003eto\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e:\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eUInt8\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e.\u003c/span\u003e\u003cspan style=\"color:#00a8c8\"\u003eself\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)[\u003c/span\u003e\u003cspan style=\"color:#ae81ff\"\u003e0\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e]\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan 
style=\"color:#111\"\u003e}\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eAnd replace calls to \u003ccode\u003efgetc()\u003c/code\u003e with \u003ccode\u003egetc()\u003c/code\u003e.\u003c/p\u003e\n\u003cp\u003eReading through a descriptor gave around 2s of improvement, bringing the time down to ~7s to process 1 billion rows. This\n\u003ca href=\"https://github.com/khlopko/1brc-swift/tree/52b0d8a7631f495a71dfe8d2047df55a33e8a345\"\u003eversion sits by the link\u003c/a\u003e, with measurements taken without power plugged in recorded in its README.\u003c/p\u003e\n\u003ch1 id=\"10k-and-capacity\"\u003e10K and capacity\u003c/h1\u003e\n\u003cp\u003eThe challenge has another, more demanding dataset with 10k different stations instead of the 413 in the default one. I\nwas curious how well the implementation would perform on this dataset, since I had made some assumptions about dictionary\ncapacity. Without capacity modifications, it takes around twice as long to process: 12 seconds.\u003c/p\u003e\n\u003cp\u003eAt this point I changed the capacity to 11k (slightly more than the set), and remembered a property of many containers:\nthey grow by doubling their underlying storage. So instead of setting a somewhat arbitrary number, I\u0026rsquo;d better use a power of 2.\n14 is the smallest exponent that gives a power of 2 greater than 10k (2^14 = 16384), so here we go: \u003ccode\u003elet capacity = 1 \u0026lt;\u0026lt; 14\u003c/code\u003e. This little change led to a drastic\nimprovement, with the running time reduced to 8.8s. 
We sliced off about 3 seconds just by using a better capacity.\u003c/p\u003e\n\u003cp\u003eBy now I already suspected that an initial dictionary capacity of 500 wasn\u0026rsquo;t enough.\nSo I ran the 10k improvement on the default dataset and got a 0.5s improvement, with the running time dropping to 6.5s.\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eFinal implementation: \u003ca href=\"https://github.com/khlopko/1brc-swift/tree/161f7c9c5a9ea236edd6e4363a549d3b6be274e6\"\u003eclick here\u003c/a\u003e.\u003c/strong\u003e\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003ch1 id=\"ideas-for-further-improvement\"\u003eIdeas for further improvement\u003c/h1\u003e\n\u003cp\u003eFor now, I have mostly run out of ideas on how to improve it further. Reading is fast; the main bottlenecks are\nin parsing.\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\n\u003cp\u003eMost likely, if we could avoid the conversion to [UInt8] and use a pointer that we advance instead, we would be able to reduce the time,\nsince there would be no overhead for array creation or for the bounds checks performed on subscript. It turned out to be a more\ncomplex task than I thought, and for now it is still an idea to check.\u003c/p\u003e\n\u003c/li\u003e\n\u003cli\u003e\n\u003cp\u003eThe second bottleneck is the dictionary. Although its implementation is efficient, and we are using efficient hashing paired\nwith thoughtful memory allocation, it still costs more than array access, plus we have to create a new structure each time.\nThe hashing algorithm we are using does not produce collisions on our data set, so we could modify it to produce array indexes\nand migrate from a dictionary to an array. I would expect that to give a huge time improvement too, if these assumptions hold.\u003c/p\u003e\n\u003c/li\u003e\n\u003cli\u003e\n\u003cp\u003eTemperature is constrained to values from -99.9 to 99.9, with exactly one fraction digit. 
We already benefit\nfrom the single fraction digit by assuming it implicitly in the code, but the range still leaves room for improvement.\u003c/p\u003e\n\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThe remaining possible improvements would probably include SIL generation analysis, looking at the assembly code, use of SIMD, and so on.\nIt would be interesting to dive into that some day, but for now there are still options to try before that.\u003c/p\u003e\n\u003ch1 id=\"final-thoughts\"\u003eFinal thoughts\u003c/h1\u003e\n\u003cp\u003eIt turned out to be a complex task for me, but not overly so. 6.5 seconds on 1 billion rows seems to be a pretty good result.\nSuch tasks make you learn some internals of the language that you do not bump into very often if such performance is not your\nmain concern.\u003c/p\u003e\n","url":"https://khlopko.com/posts/1brc-in-swift/","date_published":"21036-21-09T353:2121:00+01:00","date_modified":"21036-21-09T353:2121:00+01:00","author":{"name":"Kyrylo Khlopko","url":"https://khlopko.com/"}},{"id":"d7409195eed1ef943bd4e8c83fd9e887963f4e9d","title":"TDD: The minimum code to pass the test","summary":"","content_text":"This article opens a whole category about test-driven development (TDD). It will be covering questions that arise during this practice and observations\u0026hellip; like this one.\nWhat is this all about? The basic idea of TDD is to write tests first, then code. You write the simplest code that fails, then make it pass and repeat.\nBut the \u0026ldquo;simplest\u0026rdquo; here tends to be confusing. What should be considered to match this parameter? The simplest implementation of the whole algorithm? Well, no.\nThe minimum code that fails The less code you write in your tests, the better. And this also applies to the simplicity part. You can consider a test failing if it does not compile, crashes, or (finally) an assertion does not pass.\nThat means the simplest code that fails is one that does not compile. 
You try to instantiate an instance, but the type has not been defined yet - test failing. You try to call a method that does not exist yet - test failing.\nThe process of writing a test is that simple, and it has fast iterations between the test and the real code: add a parameter in the test – fail – update the real code.\nThe real code uses the same idea The \u0026ldquo;simplest code\u0026rdquo; also applies to the real code part. You write the minimum, the most straightforward code, that makes your test green. And here, often, it gets confusing. Let me illustrate.\ndef test_pow_zero(): result = pow(0, 2) assert result == 0 We expect 0*0=0 with the code above, which is pretty obvious. This is our first test for the pow function. Now we need to write the code to pass the test. What should it look like?\ndef pow(base, exp): return 0 If this is surprising for you, do not be upset. You need to hack your mind first to get comfortable with that idea. You do not need to make the entire solution for the first test to pass, \u0026ldquo;zero\u0026rdquo; will be just enough.\nThen, as you proceed, you will write the next test, for example, for 1*1=1:\ndef test_pow_one(): result = pow(1, 2) assert result == 1 The test fails because we always return 0. Let\u0026rsquo;s modify:\ndef pow(base, exp): return base Both tests are now passing, right? Finally, we cover one more case here for 10:\ndef test_pow_ten(): result = pow(10, 2) assert result == 100 And the modification of the real code will be\ndef pow(base, exp): return base * base Everything is passing, and we\u0026rsquo;ve already covered 3 cases.\nYou might say \u0026ldquo;This case is too simple, and the pow can be written without any tests!\u0026rdquo;. The key purpose is to illustrate the amount of real code we need to pass the test.\n","content_html":"\u003cp\u003eThis article opens a whole category about test-driven development (TDD). 
It will be covering questions that arise during this practice and observations\u0026hellip; like this one.\u003c/p\u003e\n\u003ch2 id=\"what-is-this-all-about\"\u003eWhat is this all about?\u003c/h2\u003e\n\u003cp\u003eThe basic idea of TDD is to write tests first, then code. You write the simplest code that fails, then make it pass and repeat.\u003c/p\u003e\n\u003cp\u003eBut the \u0026ldquo;simplest\u0026rdquo; here tends to be confusing. What should be considered to match this parameter? The simplest implementation of the whole algorithm? Well, no.\u003c/p\u003e\n\u003ch2 id=\"the-minimum-code-that-fails\"\u003eThe minimum code that fails\u003c/h2\u003e\n\u003cp\u003eThe less code you write in your tests, the better. And this also applies to the simplicity part. You can consider a test failing if it does not compile, crashes, or (finally) an assertion does not pass.\u003c/p\u003e\n\u003cp\u003eThat means the simplest code that fails is one that does not compile. You try to instantiate an instance, but the type has not been defined yet - test failing. You try to call a method that does not exist yet - test failing.\u003c/p\u003e\n\u003cp\u003eThe process of writing a test is that simple, and it has fast iterations between the test and the real code: add a parameter in the test – fail – update the real code.\u003c/p\u003e\n\u003ch2 id=\"the-real-code-uses-the-same-idea\"\u003eThe real code uses the same idea\u003c/h2\u003e\n\u003cp\u003eThe \u0026ldquo;simplest code\u0026rdquo; also applies to the real code part. You write the minimum, the most straightforward code, that makes your test green. And here, often, it gets confusing. 
Let me illustrate.\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-python\" data-lang=\"python\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003edef\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003etest_pow_zero\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e():\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003epow\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#ae81ff\"\u003e0\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e2\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003eassert\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e==\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e0\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eWe expect \u003ccode\u003e0*0=0\u003c/code\u003e with the code above, which is pretty obvious. This is our first test for the \u003ccode\u003epow\u003c/code\u003e function. Now we need to write the code to pass the test. 
What should it look like?\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-python\" data-lang=\"python\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003edef\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003epow\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003ebase\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eexp\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e):\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003ereturn\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e0\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eIf this is surprising for you, do not be upset. You need to hack your mind first to get comfortable with that idea. 
You do not need to make the entire solution for the first test to pass, \u0026ldquo;zero\u0026rdquo; will be just enough.\u003c/p\u003e\n\u003cp\u003eThen, as you proceed, you will write the next test, for example, for \u003ccode\u003e1*1=1\u003c/code\u003e:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-python\" data-lang=\"python\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003edef\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003etest_pow_one\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e():\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003epow\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e2\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003eassert\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e==\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eThe test fails because we always return 0. 
Let\u0026rsquo;s modify:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-python\" data-lang=\"python\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003edef\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003epow\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003ebase\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eexp\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e):\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003ereturn\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ebase\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eBoth tests are now passing, right? 
Finally, we cover one more case here for 10:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-python\" data-lang=\"python\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003edef\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003etest_pow_ten\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e():\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#111\"\u003epow\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#ae81ff\"\u003e10\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e2\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e)\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003eassert\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eresult\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e==\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e100\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eAnd the modification of the real code will be\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#272822;background-color:#fafafa;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-python\" 
data-lang=\"python\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#00a8c8\"\u003edef\u003c/span\u003e \u003cspan style=\"color:#75af00\"\u003epow\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e(\u003c/span\u003e\u003cspan style=\"color:#111\"\u003ebase\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e,\u003c/span\u003e \u003cspan style=\"color:#111\"\u003eexp\u003c/span\u003e\u003cspan style=\"color:#111\"\u003e):\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#00a8c8\"\u003ereturn\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ebase\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003e*\u003c/span\u003e \u003cspan style=\"color:#111\"\u003ebase\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eEverything is passing, and we\u0026rsquo;ve already covered 3 cases.\u003c/p\u003e\n\u003cp\u003eYou might say \u0026ldquo;This case is too simple, and the \u003ccode\u003epow\u003c/code\u003e can be written without any tests!\u0026rdquo;.\nThe key purpose is to illustrate the amount of real code we need to pass the test.\u003c/p\u003e\n","url":"https://khlopko.com/posts/tdd-minimum-code-to-pass-the-test/","date_published":"26036-26-09T347:2626:00+01:00","date_modified":"26036-26-09T347:2626:00+01:00","author":{"name":"Kyrylo Khlopko","url":"https://khlopko.com/"}}]}