AYENDE @ RAHIEN
Oren Eini aka Ayende Rahien, CEO of Hibernating Rhinos LTD, which develops RavenDB, a NoSQL Open Source Document Database.
Get in touch with me: oren@ravendb.net +972 52-548-6969
Posts: 7,447 | Comments: 50,919
Copyright © Ayende Rahien 2004 — 2023
Aug 07 2023
STRUCT MEMORY LAYOUT OPTIMIZATIONS, PRACTICAL CONSIDERATIONS
time to read 9 min | 1710 words
--------------------------------------------------------------------------------
1 comment | Tags: architecture, design, development, ravendb.net

In my previous post I discussed how we could store the exact same information in several ways, leading to space savings of 66%! That leads to interesting questions about actually making use of this technique in the real world.

The reason I posted about this topic is that we just gained a very significant reduction in memory (and we obviously care about reducing resource usage). The question is whether this is something that you want to do in general. Let’s look at that in detail.

For this technique to be useful, you should be using structs in the first place. That is… not quite true, actually. Let’s take a look at the following declarations:

StructVsClass.cs:
public class PersonClass
{
    public int Id;
    public DateTime Birthday;
    public ushort Kids;
}

public struct PersonStruct
{
    public int Id;
    public DateTime Birthday;
    public ushort Kids;
}

We define the same shape twice. Once as a class and once as a structure. How does this look in memory?

Type layout for 'PersonClass'
Size: 32 bytes. Paddings: 2 bytes (%12 of empty space)
        Object Header       8 bytes
        Method Table Ptr    8 bytes
 16-19: Int32 Id            4 bytes
 20-21: UInt16 Kids         2 bytes
 22-23: padding             2 bytes
 24-31: DateTime Birthday   8 bytes

Type layout for 'PersonStruct'
Size: 24 bytes. Paddings: 10 bytes (%41 of empty space)
  0- 3: Int32 Id            4 bytes
  4- 7: padding             4 bytes
  8-15: DateTime Birthday   8 bytes
 16-17: UInt16 Kids         2 bytes
 18-23: padding             6 bytes

Here you can find some really interesting differences. The struct is smaller than the class, but the amount of wasted space is much higher in the struct. What is the reason for that?

The class needs to carry 16 bytes of metadata. That is the object header and the pointer to the method table. You can read more about the topic here. So the memory overhead for a class is 16 bytes at a minimum. But look at the rest of it. You can see that the layout in memory of the fields is different in the class versus the structure. C# is free to re-order the fields to reduce the padding and get better memory utilization for classes, but I would need [StructLayout(LayoutKind.Auto)] to do the same for structures. The difference between the two options can be quite high, as you can imagine.

Note that automatically laying out the fields in this manner means that you’re effectively declaring that the memory layout is an implementation detail. This means that you cannot persist it, send it to native code, etc. Basically, the internal layout may change at any time. Classes in C# are obviously not meant for you to poke into their internals, and LayoutKind.Auto comes with an explicit warning about its behavior.
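To make that concrete, here is a minimal sketch (my own illustration, not code from the post) of opting a struct into automatic layout; the attribute trades a stable, interop-friendly layout for whatever packing the runtime chooses:

using System;
using System.Runtime.InteropServices;

// Same shape as PersonStruct above, but the runtime is now free to
// re-order the fields, just as it already does for classes.
[StructLayout(LayoutKind.Auto)]
public struct PersonStructAuto
{
    public int Id;
    public DateTime Birthday;
    public ushort Kids;
}

On typical runtimes this should bring the struct from 24 bytes down to 16, but the exact result is an implementation detail, which is exactly why such a struct shouldn’t be persisted or handed to native code.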
Interestingly enough, [StructLayout] will work on classes as well; you can use it to force LayoutKind.Sequential on a class. That is by design, because you may need to pass a part of your class to unmanaged code, so you have the ability to control memory explicitly. (Did I mention that I love C#?)

Going back to the original question, why would you want to go to this trouble? As we just saw, if you are using classes (which you are likely to default to), you already benefit from automatic layout of fields in memory. If you are using structs, you can enable LayoutKind.Auto to get the same behavior. This technique is for the 1% of cases where that is not sufficient, when you can see that your memory usage is high and you can benefit greatly from manually doing something about it.

That leads to the follow-up question: if we go about implementing this, what is the overhead over time? If I want to add a new field to an optimized struct, I need to be able to understand how it is laid out in memory, etc. Like any optimization, you need to maintain it. Here is a recent example from RavenDB. In this case, we used to have an optimization that had a meaningful impact. The .NET code changed, the optimization no longer makes sense, so we reverted it to get even better perf.

At those levels, you don’t get to rest on your laurels. You have to keep checking your assumptions. If you got to the point where you are manually optimizing memory layouts for better performance, there are two options:

* You are doing that for fun, and there is no meaningful impact on your system over time if this degrades.
* There is an actual need for this, so you’ll need to invest the effort in regular maintenance.

You can make that easier by adding tests to verify those assumptions, for example, verifying that the amount of padding in structs matches expectations (see the sketch below). A simple test that verifies the size of a struct means that any change to it is explicit. You’ll need to modify the test as well, and presumably that is easier to catch / review / figure out than just adding a field and not noticing the impact.
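Here is a minimal sketch of such a test (my own illustration, assuming an xUnit test project; the struct is the PersonStruct from above):

using System.Runtime.CompilerServices;
using Xunit;

public class StructLayoutTests
{
    // If a field is added or the layout changes, this number changes too,
    // forcing an explicit decision (and a code review) about the new layout.
    [Fact]
    public void PersonStruct_size_is_unchanged()
    {
        Assert.Equal(24, Unsafe.SizeOf<PersonStruct>());
    }
}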
In short, this isn’t a generally applicable pattern. This is a technique that is meant to be applied in cases of true need, where you’ll happily accept the additional maintenance overhead for the better performance and reduced resource usage.

read more ›

Aug 04 2023
TECHNOLOGY & FRIENDS: OREN EINI ON BUILDING PROJECTS THAT ENDURE
time to read 1 min | 22 words
--------------------------------------------------------------------------------
0 comments | Tags: ravendb.net, community

read more ›

Aug 03 2023
STRUCT MEMORY LAYOUT AND MEMORY OPTIMIZATIONS
time to read 35 min | 6841 words
--------------------------------------------------------------------------------
4 comments | Tags: ravendb.net, architecture, development, design

Consider a warehouse that needs to keep track of items. For the purpose of discussion, we have quite a few fields that we need to keep track of. Here is how this looks in code:

WarehouseItem.cs:
public struct WarehouseItem
{
    public Dimensions? ProductDimensions;
    public long? ExternalSku;
    public TimeSpan? ShelfLife;
    public float? AlcoholContent;
    public DateTime? ProductionDate;
    public int? RgbColor;
    public bool? IsHazardous;
    public float? Weight;
    public int? Quantity;
    public DateTime? ArrivalDate;
    public bool? Fragile;
    public DateTime? LastStockCheckDate;

    public struct Dimensions
    {
        public float Length;
        public float Width;
        public float Height;
    }
}

And the actual Warehouse class looks like this:

Warehouse.cs:
public class Warehouse
{
    private List<WarehouseItem> _items = new();

    public int Add(WarehouseItem item);
    public WarehouseItem Get(int itemId);
}

The idea is that this is simply a wrapper around the list of items. We use a struct to make sure that we have good locality, etc. The question is, what is the cost of this? Let’s say that we have a million items in the warehouse. That would be over 137MB of memory. In fact, a single struct instance is going to consume a total of 144 bytes. That is… a big struct, I have to admit. Using ObjectLayoutInspector I was able to get the details on what exactly is going on:

Type layout for 'WarehouseItem'
Size: 144 bytes. Paddings: 62 bytes (%43 of empty space)
  0- 15: Nullable`1 ProductDimensions  16 bytes (hasValue 1 byte, padding 3 bytes, Dimensions value 12 bytes)
 16- 31: Nullable`1 ExternalSku        16 bytes (hasValue 1 byte, padding 7 bytes, Int64 value 8 bytes)
 32- 47: Nullable`1 ShelfLife          16 bytes (hasValue 1 byte, padding 7 bytes, TimeSpan value 8 bytes)
 48- 55: Nullable`1 AlcoholContent      8 bytes (hasValue 1 byte, padding 3 bytes, Single value 4 bytes)
 56- 71: Nullable`1 ProductionDate     16 bytes (hasValue 1 byte, padding 7 bytes, DateTime value 8 bytes)
 72- 79: Nullable`1 RgbColor            8 bytes (hasValue 1 byte, padding 3 bytes, Int32 value 4 bytes)
 80- 81: Nullable`1 IsHazardous         2 bytes (hasValue 1 byte, Boolean value 1 byte)
 82- 83: padding                        2 bytes
 84- 91: Nullable`1 Weight              8 bytes (hasValue 1 byte, padding 3 bytes, Single value 4 bytes)
 92- 99: Nullable`1 Quantity            8 bytes (hasValue 1 byte, padding 3 bytes, Int32 value 4 bytes)
100-103: padding                        4 bytes
104-119: Nullable`1 ArrivalDate        16 bytes (hasValue 1 byte, padding 7 bytes, DateTime value 8 bytes)
120-121: Nullable`1 Fragile             2 bytes (hasValue 1 byte, Boolean value 1 byte)
122-127: padding                        6 bytes
128-143: Nullable`1 LastStockCheckDate 16 bytes (hasValue 1 byte, padding 7 bytes, DateTime value 8 bytes)
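As an aside, dumps like the one above are easy to produce yourself. A minimal sketch (my own, assuming the ObjectLayoutInspector NuGet package) looks like this:

using ObjectLayoutInspector;

public static class LayoutDump
{
    public static void Main()
    {
        // Prints the offsets, sizes and padding of every field in the struct,
        // similar to the dumps shown in this post.
        TypeLayout.PrintLayout<WarehouseItem>();
    }
}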
As you can see, there is a huge amount of wasted space here, most of it because of the nullability. That injects an additional byte per field, and then padding and layout issues really explode the size of the struct. Here is an alternative layout, which conveys the same information much more compactly. The idea is that instead of having a full byte for each nullable field (with the impact on padding, etc.), we’ll have a single bitmap for all the nullable fields. Here is how this looks:

Smaller.cs:
public struct WarehouseItem
{
    public Dimensions ProductDimensions;
    public bool HasProductDimensions => (_nullability & (1 << 0)) != 0;

    public long ExternalSku;
    public bool HasExternalSku => (_nullability & (1 << 1)) != 0;

    public TimeSpan ShelfLife;
    public bool HasShelfLife => (_nullability & (1 << 2)) != 0;

    public float AlcoholContent;
    public bool HasAlcoholContent => (_nullability & (1 << 3)) != 0;

    public DateTime ProductionDate;
    public bool HasProductionDate => (_nullability & (1 << 4)) != 0;

    public int RgbColor;
    public bool HasRgbColor => (_nullability & (1 << 5)) != 0;

    public bool IsHazardous;
    public bool HasIsHazardous => (_nullability & (1 << 6)) != 0;

    public float Weight;
    public bool HasWeight => (_nullability & (1 << 7)) != 0;

    public int Quantity;
    public bool HasQuantity => (_nullability & (1 << 8)) != 0;

    public DateTime ArrivalDate;
    public bool HasArrivalDate => (_nullability & (1 << 9)) != 0;

    public bool Fragile;
    public bool HasFragile => (_nullability & (1 << 10)) != 0;

    public DateTime LastStockCheckDate;
    public bool HasLastStockCheckDate => (_nullability & (1 << 11)) != 0;

    private ushort _nullability;

    public struct Dimensions
    {
        public float Length;
        public float Width;
        public float Height;
    }
}

If we look deeper into this, we’ll see that this saved a lot; the struct is now 96 bytes in size. It’s a massive space saving, but…

Type layout for 'WarehouseItem'
Size: 96 bytes. Paddings: 24 bytes (%25 of empty space)

We still have a lot of wasted space. This is because we haven’t organized the struct to eliminate padding. Let’s reorganize the struct’s fields and see what we can achieve. The only change I made was to re-arrange the fields, and we have:
Smallest.cs:
public struct WarehouseItem
{
    public Dimensions ProductDimensions;
    public float AlcoholContent;
    public long ExternalSku;
    public TimeSpan ShelfLife;
    public DateTime ProductionDate;
    public DateTime ArrivalDate;
    public DateTime LastStockCheckDate;
    public float Weight;
    public int Quantity;
    public int RgbColor;
    public bool Fragile;
    public bool IsHazardous;
    private ushort _nullability;

    public bool HasProductDimensions => (_nullability & (1 << 0)) != 0;
    public bool HasExternalSku => (_nullability & (1 << 1)) != 0;
    public bool HasShelfLife => (_nullability & (1 << 2)) != 0;
    public bool HasAlcoholContent => (_nullability & (1 << 3)) != 0;
    public bool HasProductionDate => (_nullability & (1 << 4)) != 0;
    public bool HasRgbColor => (_nullability & (1 << 5)) != 0;
    public bool HasIsHazardous => (_nullability & (1 << 6)) != 0;
    public bool HasWeight => (_nullability & (1 << 7)) != 0;
    public bool HasQuantity => (_nullability & (1 << 8)) != 0;
    public bool HasArrivalDate => (_nullability & (1 << 9)) != 0;
    public bool HasFragile => (_nullability & (1 << 10)) != 0;
    public bool HasLastStockCheckDate => (_nullability & (1 << 11)) != 0;

    public struct Dimensions
    {
        public float Length;
        public float Width;
        public float Height;
    }
}

And the struct layout is now:

Type layout for 'WarehouseItem'
Size: 72 bytes. Paddings: 0 bytes (%0 of empty space)
  0-11: Dimensions ProductDimensions 12 bytes (Length 4, Width 4, Height 4)
 12-15: Single AlcoholContent         4 bytes
 16-23: Int64 ExternalSku             8 bytes
 24-31: TimeSpan ShelfLife            8 bytes
 32-39: DateTime ProductionDate       8 bytes
 40-47: DateTime ArrivalDate          8 bytes
 48-55: DateTime LastStockCheckDate   8 bytes
 56-59: Single Weight                 4 bytes
 60-63: Int32 Quantity                4 bytes
 64-67: Int32 RgbColor                4 bytes
    68: Boolean Fragile               1 byte
    69: Boolean IsHazardous           1 byte
 70-71: UInt16 _nullability           2 bytes

We have no wasted space, and we are at 50% of the previous size. We can actually do better. Note that Fragile and IsHazardous are Booleans, and we have some free bits on _nullability that we can repurpose. For that matter, RgbColor only needs 24 bits, not 32. Do we need the alcohol content to be a float, or can we use a byte? If that is the case, can we shove both of them together into the same 4 bytes? For dates, can we use DateOnly instead of DateTime? What about ShelfLife, can we measure that in hours and use a short for that (giving us a maximum of 7 years)? After all of that, we end up with the following structure:
Packed.cs:
public struct WarehouseItem
{
    public Dimensions ProductDimensions;
    public float Weight;
    public long ExternalSku;
    public DateOnly ProductionDate;
    public DateOnly ArrivalDate;
    public DateOnly LastStockCheckDate;
    public int Quantity;
    private int _rgbColorAndAlcoholContentBacking;
    private ushort _nullability;
    public ushort ShelfLifeInHours;

    public float AlcoholContent => (float)(byte)_rgbColorAndAlcoholContentBacking;
    public int RgbColor => _rgbColorAndAlcoholContentBacking >> 8;

    public bool Fragile => (_nullability & (1 << 12)) != 0;
    public bool IsHazardous => (_nullability & (1 << 13)) != 0;

    public bool HasProductDimensions => (_nullability & (1 << 0)) != 0;
    public bool HasExternalSku => (_nullability & (1 << 1)) != 0;
    public bool HasShelfLife => (_nullability & (1 << 2)) != 0;
    public bool HasAlcoholContent => (_nullability & (1 << 3)) != 0;
    public bool HasProductionDate => (_nullability & (1 << 4)) != 0;
    public bool HasRgbColor => (_nullability & (1 << 5)) != 0;
    public bool HasIsHazardous => (_nullability & (1 << 6)) != 0;
    public bool HasWeight => (_nullability & (1 << 7)) != 0;
    public bool HasQuantity => (_nullability & (1 << 8)) != 0;
    public bool HasArrivalDate => (_nullability & (1 << 9)) != 0;
    public bool HasFragile => (_nullability & (1 << 10)) != 0;
    public bool HasLastStockCheckDate => (_nullability & (1 << 11)) != 0;

    public struct Dimensions
    {
        public float Length;
        public float Width;
        public float Height;
    }
}

And with the following layout:

Type layout for 'WarehouseItem'
Size: 48 bytes. Paddings: 0 bytes (%0 of empty space)
  0-11: Dimensions ProductDimensions            12 bytes (Length 4, Width 4, Height 4)
 12-15: Single Weight                            4 bytes
 16-23: Int64 ExternalSku                        8 bytes
 24-27: DateOnly ProductionDate                  4 bytes
 28-31: DateOnly ArrivalDate                     4 bytes
 32-35: DateOnly LastStockCheckDate              4 bytes
 36-39: Int32 Quantity                           4 bytes
 40-43: Int32 _rgbColorAndAlcoholContentBacking  4 bytes
 44-45: UInt16 _nullability                      2 bytes
 46-47: UInt16 ShelfLifeInHours                  2 bytes

In other words, we are now packing everything into 48 bytes, which means that we are at one-third of the initial cost, while still representing the same data. Our previous Warehouse class? It used to take 137MB for a million items; it would now take only 45.7 MB.
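The struct above only shows the read side of the shared backing field. For illustration, here is one way the write side could look; this is my own sketch, not the post’s code, and it assumes the alcohol content is stored as a whole number in the low 8 bits with the 24-bit RGB color above it:

public struct PackedColorAndAlcohol
{
    private int _backing;

    // Mask after shifting so a color with the high bit set does not get
    // sign-extended into garbage.
    public int RgbColor => (_backing >> 8) & 0xFFFFFF;
    public float AlcoholContent => (byte)_backing;

    public void Set(int rgbColor, byte alcoholContent)
    {
        // rgbColor is assumed to fit in 24 bits (0x000000..0xFFFFFF).
        _backing = (rgbColor << 8) | alcoholContent;
    }
}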
In RavenDB’s case, we had the following:

That is the backing store of the dictionary, and as you can see, it isn’t a nice one. Using similar techniques we were able to massively reduce the amount of storage that is required to process indexing. Here is what this same scenario looks like now:

But we aren’t done yet, there is still more that we can do.

read more ›

Jul 24 2023
PRODUCTION POSTMORTEM: THE DOG ATE MY REQUEST
time to read 3 min | 417 words
--------------------------------------------------------------------------------
2 comments | Tags: databases, challanges, architecture, raven, ravendb.net

A customer called us, quite upset, because their RavenDB cluster was failing every few minutes. That was weird, because they were running on our cloud offering, so we had full access to the metrics, and we saw absolutely no problem on our end. During the call, it turned out that every now and then, but almost always immediately after a new deployment, RavenDB would fail some requests. On a fairly consistent basis, we could see two failures and a retry that was finally successful.

Okay, so at least there is no user-visible impact, but this was still super strange to see. On the backend, we couldn’t see any reason why we would get that sort of error. Looking at the failure stack, we narrowed things down to an async operation that was invoked via DataDog. Our suspicions focused on this being an error in the async machinery customization that DataDog uses for adding non-invasive monitoring. We created a custom build for the user that they could test and waited to get the results from their environment. Trying to reproduce this locally using the DataDog integration didn’t raise any flags.

The good thing was that we did find a smoking gun, a violation of the natural order and invariant-breaking behavior. The not-so-good news was that it was in our own code. At least that meant that we could fix it.

Let’s see if I can explain what is going on. The customer was using a custom configuration: FastestNode. This is used to find the nearest / least loaded node in the cluster and operate from it. How does RavenDB know which is the fastest node? That is kind of hard to answer, after all. It checks. Every now and then, RavenDB replicates a read request to all nodes in the cluster. Something like this:

FastestNode.cs:
async Task<Node> FindFastest(Request req)
{
    using var cts = new CancellationTokenSource();
    var tasks = new List<Task>();
    foreach (var node in cluster.Nodes)
    {
        tasks.Add(req.Execute(node, cts.Token));
    }
    var first = await Task.WhenAny(tasks);
    cts.Cancel(); // cancel the requests that lost the race
    var idx = tasks.IndexOf(first);
    return cluster.Nodes[idx];
}

The idea is that we send the request to all the nodes and wait for the first one to arrive. Since this is the same request, all servers will do the same amount of work, and we’ll find the fastest node from our perspective. Did you notice the cancellation token in there? When we return from this function, we cancel the existing requests. Here is what this looks like from the monitoring perspective:

It looks exactly as if, every few minutes, we have a couple of failures (and a failover) in the system, which was quite confusing until we figured out exactly what was going on.

read more ›

Jul 21 2023
PODCAST: HANSELMINUTES - ALL THE PERFORMANCE WITH RAVENDB'S OREN EINI
time to read 1 min | 110 words
--------------------------------------------------------------------------------
0 comments | Tags: community, ravendb.net, raven

I had a great time talking with Scott Hanselman about how we achieve great performance for RavenDB with .NET. You can listen to the podcast here; as usual, I would love your feedback.

> In this episode, we talk to Oren Eini from RavenDB. RavenDB is a NoSQL document database that offers high performance, scalability, and security. Oren shares his insights on why performance is not just a feature, but a service that developers and customers expect and demand. He also explains how RavenDB achieves fast and reliable data access, how it handles complex queries and distributed transactions, and how it leverages the cloud to optimize resource utilization and cost efficiency!
read more ›

Jul 05 2023
SOLVING HEAP CORRUPTION ERRORS IN MANAGED APPLICATIONS
time to read 2 min | 369 words
--------------------------------------------------------------------------------
3 comments | Tags: challenges, development, design, bugs

RavenDB is a .NET application, written in C#. It also has a non-trivial amount of unmanaged memory usage. We absolutely need that to get the level of performance that we require. With manual memory management comes the possibility that we’ll mess it up. We ran into one such case: when running our full test suite (over 10,000 tests) we would get random crashes due to heap corruption. Those issues are nasty, because there is a big separation between the root cause and the point where the problem actually manifests.

I recently learned that you can use the gflags tool on .NET executables. We were able to narrow the problem to a single scenario, but we still had no idea where the problem really occurred. So I installed the Debugging Tools for Windows and then executed:

    & "C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\gflags.exe" /p /enable C:\Work\ravendb-6.0\test\Tryouts\bin\release\net7.0\Tryouts.exe

What this does is enable a special debug heap at the executable level, which applies to all operations (managed and native memory alike). With that enabled, I ran the scenario in question:

    PS C:\Work\ravendb-6.0\test\Tryouts> C:\Work\ravendb-6.0\test\Tryouts\bin\release\net7.0\Tryouts.exe
    42896
    Starting to run 0
    Max number of concurrent tests is: 16
    Ignore request for setting processor affinity. Requested cores: 3. Number of cores on the machine: 32.
    To attach debugger to test process (x64), use proc-id: 42896. Url http://127.0.0.1:51595
    Ignore request for setting processor affinity. Requested cores: 3. Number of cores on the machine: 32. License limits: A: 3/32. Total utilized cores: 3. Max licensed cores: 1024
    http://127.0.0.1:51595/studio/index.html#databases/documents?&database=Should_correctly_reduce_after_updating_all_documents_1&withStop=true&disableAnalytics=true
    Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
    at Sparrow.Server.Compression.Encoder3Gram`1[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Encode(System.ReadOnlySpan`1<Byte>, System.Span`1<Byte>)
    at Sparrow.Server.Compression.HopeEncoder`1[[Sparrow.Server.Compression.Encoder3Gram`1[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], Sparrow.Server, Version=6.0.0.0, Culture=neutral, PublicKeyToken=37f41c7f99471593]].Encode(System.ReadOnlySpan`1<Byte> ByRef, System.Span`1<Byte> ByRef)
    at Voron.Data.CompactTrees.PersistentDictionary.ReplaceIfBetter[[Raven.Server.Documents.Indexes.Persistence.Corax.CoraxDocumentTrainEnumerator, Raven.Server, Version=6.0.0.0, Culture=neutral, PublicKeyToken=37f41c7f99471593],[Raven.Server.Documents.Indexes.Persistence.Corax.CoraxDocumentTrainEnumerator, Raven.Server, Version=6.0.0.0, Culture=neutral, PublicKeyToken=37f41c7f99471593]](Voron.Impl.LowLevelTransaction, Raven.Server.Documents.Indexes.Persistence.Corax.CoraxDocumentTrainEnumerator, Raven.Server.Documents.Indexes.Persistence.Corax.CoraxDocumentTrainEnumerator, Voron.Data.CompactTrees.PersistentDictionary)
    at Raven.Server.Documents.Indexes.Persistence.Corax.CoraxIndexPersistence.Initialize(Voron.StorageEnvironment)

That pinpointed things, so I was able to know exactly where we were messing up. I was also able to reproduce the behavior under the debugger:

This saved me hours or days of trying to figure out where the problem actually was.

read more ›

Jul 04 2023
CAFÉ DEBUG - INTERVIEW WITH OREN EINI, CEO OF RAVENDB
time to read 1 min | 14 words
--------------------------------------------------------------------------------
0 comments | Tags: architecture, community, ravendb.net

read more ›

Jul 03 2023
PRODUCTION POSTMORTEM: ENOMEM WHEN TRYING TO FREE MEMORY
time to read 3 min | 541 words
--------------------------------------------------------------------------------
0 comments | Tags: architecture, design, databases, development, ravendb.net, raven

We got a support call from a client in the early hours of the morning: they were getting out-of-memory errors from their database and were understandably perturbed by that. They are running on a cloud system, so the first inclination of the admin when seeing the problem was to deploy the server on a bigger instance, to at least get things running while they investigate. Doubling and then quadrupling the amount of memory that the system had made no impact. A few minutes after the system booted, it would raise an error about running out of memory.

Except that it wasn’t actually running out of memory. A scenario like that, when we give more memory to the system and still get out-of-memory errors, can indicate a leak or an unbounded process of some kind. That wasn’t the case here. In all system configurations (including the original one), there was plenty of additional memory in the system. Something else was going on.

When our support engineer looked at the actual details of the problem, it was quite puzzling. It looked something like this:

    System.OutOfMemoryException: ENOMEM on Failed to munmap at Sparrow.Server.Platform.Posix.Syscall.munmap(IntPtr start, UIntPtr length);

That error made absolutely no sense, as you can imagine. We are trying to release memory, not allocate it. Common sense says that you can’t really fail when you are freeing memory. After all, how can you run out of memory?
I’m trying to give you some, damn it!

It turns out that this model is too simplistic. You can actually run out of memory when trying to release it. The issue is that it isn’t you that is running out of memory, but the kernel. Here we are talking specifically about the Linux kernel, and how it works.

Obviously a very important part of the kernel’s job is managing the system memory, and to do that, the kernel itself needs memory. For managing the system memory, the kernel uses something called a VMA (virtual memory area). Each VMA has its own permissions and attributes. In general, you never need to be aware of this detail. However, there are certain pathological cases where you need to set up different permissions and behaviors on a lot of memory areas.

In the case we ran into, RavenDB was using an encrypted database. When running on an encrypted database, RavenDB ensures that all plain-text data is written to memory that is locked (cannot be stored on disk / swapped out). A side effect of that is that for every piece of memory that we lock, the kernel needs to create its own VMA, since each of them is operated on independently of the others. The kernel uses VMAs to manage its own map of the memory, and eventually the number of items in the map exceeds the configured value.

In this case, the munmap call released a portion of the memory back, which means that the kernel needs to split the VMA into separate pieces. But the number of items is limited; this is controlled by the vm.max_map_count value. The default is typically 65530, but database systems often require a lot more than that. The default value is conservative, mind. Adjusting the configuration alleviates this problem, since that gives us sufficient space to operate normally.

read more ›

Jun 30 2023
RAVENDB DOCKER IMAGE UPDATES FOR THE V6.0 RELEASE
time to read 1 min | 195 words
--------------------------------------------------------------------------------
0 comments | Tags: raven, ravendb.net

We are going to be making some changes to our RavenDB 6.0 docker image; you can already access them using the nightly builds:

    docker pull ravendb/ravendb-nightly:6.0-ubuntu-latest

The purpose of those changes is to make it easier to run RavenDB in your environment and to make it a more natural fit for a Linux system. The primary reason we made this change is that we wanted to enable running RavenDB containers as non-root users. Most of the other changes are internal, primarily about file paths and how we actually install RavenDB on the container instance. We now share a single installation process across all Linux systems, which makes it easier to support and manage.

This does have an impact on users migrating from the RavenDB 5.4 docker images, but the initialization process should migrate them seamlessly. Note that if you have an existing 5.4 docker instance and you want to update it to 6.0 and run as non-root, you may need to explicitly grant permissions on the RavenDB data folder to the RavenDB user (uid: 999).

As usual, I would like to invite you to take the system for a spin. We would really love your feedback.
read more ›

Jun 26 2023
GENERATING SEQUENTIAL NUMBERS IN A DISTRIBUTED MANNER
time to read 4 min | 631 words
--------------------------------------------------------------------------------
8 comments | Tags: architecture, design, development, databases, raven, ravendb.net

On its face, we have a simple requirement:

* Generate sequential numbers
* Ensure that there can be no gaps
* Do that in a distributed manner

Generating the next number in the sequence is literally as simple as ++, so surely that is a trivial task, right? The problem is with the second requirement. The need to ensure that there are no gaps comes up often when dealing with things like invoices. The tax authorities are really keen on “show me all your invoices”, and if there are gaps in the numbers, you may have to provide Good Answers.

You may think that the third requirement, running in a distributed environment, is the tough challenge, but that isn’t actually the case. If we are running in a single location, this is fairly easy. Run the invoice id generation as a transaction, and you are done. But the normal methods of doing that are usually wrong in edge cases.

Let’s assume that we use an Oracle database, which uses the following mechanism to generate the new invoice id:

    invoice_seq.NEXTVAL

Or SQL Server with an identity column:

    CREATE TABLE invoices ( invoice_id INT IDENTITY(1,1) PRIMARY KEY, ... );

In both cases, we may insert a new value into the invoices table, consuming an invoice id. At some later point in time, we may roll back the transaction. Care to guess what happens then? You have INVOICE #1000 and INVOICE #1002, but nothing in between. In fact, there is usually no way to even tell what happened. In other words, identity, sequence, serial, or autonumber, regardless of what database platform you use, are not suitable for generating gapless numbers.

The reasoning is quite simple. Assume that you have two concurrent transactions, which generate two new invoices at roughly the same time. You commit the later one before the first one, and roll back the first. You now have:

* Invoice #1
* Invoice #2
* …
* Invoice #1000
* Invoice #1002

However, you don’t have Invoice #1001, and you cannot roll back the sequence value there, because if you do so, it will re-generate #1002 on the next call.

Instead, for gapless numbers, we need to make this a dedicated part of the transaction. So there would be a record in our system that contains the NextInvoiceId, which will be incremented as part of the new invoice creation. In order to ensure that there are no gaps, you need to ensure that the NextInvoiceId record increment is handled as a user operation, not a database operation. In other words, in SQL Server, that is a row in a table that you manually increment as part of adding a new invoice. Here is what this will look like:

new_invoice.sql:
CREATE PROCEDURE CreateNewInvoice
    @customer_email VARCHAR(50)
    -- Other invoice parameters...
AS
BEGIN
    DECLARE @next_id INT;

    UPDATE next_gapless_ids
    SET @next_id = invoice_id = invoice_id + 1
    WHERE owner = 'invoices';

    -- Insert the new invoice with the generated ID
    INSERT INTO invoices (invoice_id, customer_email, ...)
    VALUES (@next_id, @customer_email, ...);
END
As you can see, we increment the row directly, so it will be rolled back as well. The downside here is that we can no longer create two invoices concurrently. The second transaction would have to wait on the lock on the row in the next_gapless_ids table.

All of that happens inside a single database server. What happens when we are running in a distributed environment? The answer in this case is the exact same thing. You need a transaction, a distributed one, using a consensus algorithm. Here is how you can achieve this using RavenDB’s cluster-wide transactions, which use the Raft protocol behind the scenes:

GaplessRavenDB.cs:
while (true)
{
    using (var session = store.OpenSession(new SessionOptions
    {
        TransactionMode = TransactionMode.ClusterWide
    }))
    {
        var gaplessId = session.Load<GapLessId>("gapless/invoices");
        var nextId = gaplessId.Value++;
        var invoice = new Invoice
        {
            InvoiceId = nextId,
            // other properties
        };
        session.Store(invoice, "invoices/" + nextId);
        try
        {
            session.SaveChanges();
            break;
        }
        catch (ConcurrencyException)
        {
            continue; // re-try
        }
    }
}

The idea is simple: we have a transaction that modifies the gapless ids document and creates a new invoice at the same time. We have to handle a concurrency exception if two transactions try to create a new invoice at the same time (because they both want to use the same invoice id value), but in essence this is pretty much exactly the same behavior as when you are running on a single node. In other words, to ensure the right behavior, you need to use a transaction. And if you need a distributed transaction, that is just a flag away with RavenDB.

read more ›

FUTURE POSTS

1. QCon San Francisco Workshop: Building a database from the ground up - one day from now

There are posts all the way to Aug 08, 2023

RECENT SERIES

1. Production postmortem (50): 24 Jul 2023 - The dog ate my request
2. Podcast (4): 21 Jul 2023 - Hanselminutes - All the Performance with RavenDB's Oren Eini
3. Integer compression (11): 21 Jun 2023 - FastPFor in C#, results
4. Talk (7): 09 Jun 2023 - Scalable Architecture From the Ground Up
5. Fight for every byte it takes (6): 01 May 2023 - Decoding the entries

View all series

RECENT COMMENTS

* Ahh, another ancient technique from the past ages. Are we even supposed to touch bare memory? And who uses these bit operato... By Rafal on Struct memory layout optimizations, practical considerations
* Looking forward to the next post. An obvious issue with the current changes is that to maintain it, a developer needs to m... By Bart on Struct memory layout and memory optimizations
* Would `[StructLayout(LayoutKind.Auto)]` work as well as rearranging the fields did? (Obviously it won't help with the other o... By Svick on Struct memory layout and memory optimizations
* Peter, Thanks, fixed now. By Oren Eini on Struct memory layout and memory optimizations
* something seems to have gone wrong with the styling here, everything extends off-screen to the right. By peter on Struct memory layout and memory optimizations
SYNDICATION

Main feed
Comments feed