The Foreign Memory Access API is designed to replace the Unsafe and ByteBuffer APIs, addressing their limitations such as size constraints and limited access modes. It supports long offsets, enabling larger memory segments, and includes advanced access modes like volatile and compare-and-swap. It also introduces deterministic memory deallocation, allowing developers to explicitly free memory when needed.
The Foreign Function API eliminates the need for an intermediate glue code layer written in C or C++. Instead, the entire intermediate layer can be written in pure Java, simplifying the process and reducing the need for additional build pipelines or learning new languages. This also improves performance due to better visibility and optimization by the JIT compiler.
The API went through several iterations, with significant redesigns of the programming model, particularly around lifetime management of memory resources. The introduction of 'arenas' allows for safer and more deterministic memory management. Additionally, support for more function shapes and platforms was added, and a fallback implementation based on libffi was introduced.
The API benefits from the 'Panama effect,' where the JIT compiler can optimize the glue code written in Java more effectively than the intermediate C/C++ code used in JNI. Additionally, upcalls (native code calling back into Java) are up to three times faster due to the efficient handling of method handles and machine code stubs.
Libraries like Lucene, Netty, and Tomcat have adopted the API, primarily to replace their usage of Unsafe. These libraries have observed performance parity with Unsafe while gaining safer memory management through the API's lifetime management features.
Developers can use the jextract tool, which parses a native library's header file and automatically generates Java wrapper functions. This allows developers to call native library functions directly from Java without writing any C/C++ code.
The API is finalized in JDK 22 and no longer requires the --enable-preview flag. Future work includes improving the jextract tool to generate more self-contained and editable code snippets for easier integration with Java projects.
Hello and welcome back to the Inside Java podcast, a podcast about everything Java brought to you from the team at Oracle who makes Java. My name is Ana-Maria Mihalceanu and I recently had a conversation with Jorn Vernee about the finalization of the Foreign Function and Memory API in JDK 22. Project Panama and the foreign memory access API were the subject of episodes 9 and 10 of the Inside Java podcast,
but they went through many updates since then. So Jorn and I decided to keep you posted on what led to finalizing the Foreign Function and Memory API in JDK 22, the goals of these APIs, potential use cases in your daily developer life, and what comes next in Project Panama. Enjoy the show. Jorn, welcome. Can you please first introduce yourself?
Hello, I'm Jorn Vernee. I've been working on the foreign function and memory access API for about five years now, and four and a half of that have been at Oracle. So during that time, I've mostly been in charge of the foreign function API, and I basically rewrote the entire implementation that I inherited and brought it to finalization. And at the same time, I was also discussing and designing the public API of that, as well as the foreign memory access API, together with Maurizio.
Wow, that sounds like a lot. Let's dive into today's subject, and I want to start by asking you: what is the foreign memory access API and what are its goals?
Yeah, so the foreign memory access API is mostly meant as a replacement for the Unsafe and ByteBuffer APIs, which both have certain limitations. For instance, the ByteBuffer API only works based on int offsets. So that means that a byte buffer can only span a limited amount of memory,
because if it gets too large, you wouldn't be able to access it with int offsets. So the new API replaces that with long offsets, and that essentially allows you to encapsulate the entire virtual address space of a process in a single memory segment. So that allows you to have much larger memory segments. On top of that, the ByteBuffer API is limited in the access modes that it supports.
You can only do plain access. You can get multiple elements into an array or set multiple elements from an array. The memory access API is really built on top of var handles, which support many different access modes, such as set volatile access, compare and exchange, compare and swap, which are really powerful concurrency primitives that more power users could take advantage of. Besides that,
there's also a subsection of the foreign memory access API, the memory layout API, which is an API that can be used to describe the layout of a memory segment or of a region of memory. So the idea is that you declare the layout of a memory segment and then you can from that declaration derive things such as offsets to certain elements, certain nested elements,
or derive var handles that can be used to access certain elements within the memory segment. And that avoids having to do manual offset computations, for instance. And then finally, we added
deterministic deallocation in this new API, which means that you can now explicitly free memory in the new API. This is not something that was possible, for instance, with ByteBuffer. If you created a byte buffer, even if it was for native memory, so a direct byte buffer,
the garbage collector would actually have to come in and free that memory for you. So you had no control over when the memory is freed. And that can be problematic, for instance, if you are memory mapping very large files and then on the next loop iteration you want to memory map the next file, for instance, if you're implementing a database.
You need to be sure that on the next iteration of the loop, when you want to map the next file, that you actually have address space available in your process. But you can't really rely on that if the garbage collector has to come in and clean up the memory. You need to be able to say as the developer, right now I want this memory to be freed because I want to memory map the next file before moving on.
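As a minimal sketch of the ideas above (a declared layout, var handles derived from it, richer access modes, and memory that is freed at a known point), assuming JDK 22 or later. The struct shape and the names `PointCounterDemo`, `POINT`, `x`, `y`, and `hits` are made up for illustration and do not come from the conversation.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

class PointCounterDemo {
    // Hypothetical struct layout: { int x; int y; long hits; }
    static final StructLayout POINT = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("x"),
            ValueLayout.JAVA_INT.withName("y"),
            ValueLayout.JAVA_LONG.withName("hits"));

    // Var handles derived from the layout, so no manual offset arithmetic is needed.
    static final VarHandle X = POINT.varHandle(PathElement.groupElement("x"));
    static final VarHandle HITS = POINT.varHandle(PathElement.groupElement("hits"));

    // Offset of the "hits" field, computed from the declaration.
    static long hitsOffset() {
        return POINT.byteOffset(PathElement.groupElement("hits"));
    }

    static long run() {
        // The confined arena frees its memory when the try block exits.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment point = arena.allocate(POINT);

            // Plain access; the second coordinate is a base offset into the segment.
            X.set(point, 0L, 42);

            // Volatile write followed by compare-and-set on the same field.
            HITS.setVolatile(point, 0L, 1L);
            boolean first = HITS.compareAndSet(point, 0L, 1L, 2L);  // expected value matches
            boolean second = HITS.compareAndSet(point, 0L, 1L, 3L); // expected value is stale
            if (!first || second) {
                throw new IllegalStateException("unexpected compare-and-set result");
            }
            return (long) HITS.getVolatile(point, 0L);
        }
    }

    public static void main(String[] args) {
        System.out.println("offset of hits = " + hitsOffset());
        System.out.println("hits after CAS = " + run());
    }
}
```

The try-with-resources block is what makes the deallocation deterministic here: the native memory is released when the block exits rather than whenever the garbage collector gets to it.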
Okay, so moving on to the foreign function API, which is the other aspect. This is really meant as a replacement for JNI that is more pure Java. So in JNI, in order to access a native library, you have to write a separate intermediate glue code layer,
which is a native binary written in C or C++ that acts as an intermediate between your Java code and the native library you want to access. But with this new Foreign Function API, you can do the intermediate layer purely in Java. So that has advantages that you don't have to learn another programming language to do this. You can just do anything from Java.
But also, you don't have to set up a different part of your build pipeline to build this intermediate binary, and you don't have to figure out how to ship it. As long as your user has the native library you're trying to access installed on their system, you can access it directly with pure Java code.
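To make the "pure Java glue code" point concrete, here is a minimal sketch that calls the C standard library's `strlen` through the foreign linker, assuming JDK 22 or later on a 64-bit platform where `size_t` maps to a Java `long`. The class name `StrlenDemo` is hypothetical. Running without `--enable-native-access` is expected to produce a warning, since the linker methods are restricted.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

class StrlenDemo {
    static long strlen(String s) {
        Linker linker = Linker.nativeLinker();

        // Look up the address of strlen in the default (standard C library) lookup.
        MemorySegment address = linker.defaultLookup().find("strlen").orElseThrow();

        // Describe the C signature: size_t strlen(const char *s).
        // Assumption: size_t is 64 bits wide on this platform.
        MethodHandle handle = linker.downcallHandle(
                address,
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));

        try (Arena arena = Arena.ofConfined()) {
            // Copy the Java string into native memory as a NUL-terminated C string.
            MemorySegment cString = arena.allocateFrom(s);
            return (long) handle.invokeExact(cString);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(strlen("Panama")); // prints 6
    }
}
```

No C or C++ file is compiled or shipped here; the only native code involved is the library that already exists on the system.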
Wow, all that sounds cool. Well, I've kept an eye on the evolution of this API, especially since my team maintains a hands-on lab on Panama that showcases the Foreign Function and Memory API. So I was a little bit familiar with what changed throughout each incubation and preview. But for the audience, can you please walk us through what was changed, and why, between the first incubation and its current state?
Yeah, so incubation is really the experimental stage of the API.
So when we went into incubator, there were parts of the API that were just unfinished. For instance, some call shapes with native functions returning structs by value weren't implemented. So we worked on that. And then especially after we went to preview, we did a few iterations where we
rewrote and redesigned the entire programming model and really thought that out. I think at the start of preview, it's fair to say that we had a bag of working features, and it took, I think, three iterations in preview to really polish that into an API that stands on its own as a whole rather than just being a bag of separate features.
So one of the things that came out of that is that we looked at the programming model with regards to lifetime management of memory resources.
So typically in C, for instance, you can allocate memory with malloc and you can free it with free. And that means that every pointer has its own lifetime, really. Every memory chunk has its own lifetime. And because free is accessible for everyone, anybody can free that memory as well.
But what we ended up with is a model where we have what's called an arena that denotes an encompassing lifetime. And then anybody that has access to that arena can allocate memory in that lifetime. And then finally, the arena can be closed, which only the owner of the arena, whoever has access to that object, can do, and that frees all the memory that was allocated in it. And this is
also an important safety feature, because if you have multiple chunks of memory that reference each other, that will be safe to do if they share the same lifetime. That prevents situations where one of the chunks of memory is freed and then the other chunk of memory is still referencing it even though it has already gone away. So that is one of the things we polished.
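A minimal sketch of that shared-lifetime model, assuming JDK 22 or later: two segments are allocated in one arena, one holds a pointer to the other, and closing the arena invalidates both at once. The class and variable names are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class ArenaLifetimeDemo {
    static boolean accessFailsAfterClose() {
        Arena arena = Arena.ofConfined();

        // Two chunks of memory that share the arena's lifetime.
        MemorySegment payload = arena.allocate(ValueLayout.JAVA_LONG);
        MemorySegment node = arena.allocate(ValueLayout.ADDRESS);

        payload.set(ValueLayout.JAVA_LONG, 0, 123L);
        node.set(ValueLayout.ADDRESS, 0, payload); // node now points at payload

        // Closing the arena frees both segments together, so neither one can
        // outlive the other and leave a dangling reference behind.
        arena.close();

        try {
            payload.get(ValueLayout.JAVA_LONG, 0); // use after free is rejected
            return false;
        } catch (IllegalStateException expected) {
            return !payload.scope().isAlive() && !node.scope().isAlive();
        }
    }

    public static void main(String[] args) {
        System.out.println("access rejected after close: " + accessFailsAfterClose());
    }
}
```

Compared with per-pointer `malloc` and `free`, the arena groups allocations under one lifetime, and a use after close surfaces as a Java exception instead of undefined behavior.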
And then for the foreign function API, as I said, we implemented some corner cases. Well, not corner cases. We implemented some function shapes returning structs by value.
And more recently, we've also added a fallback implementation based on the libffi library. And then there's been also many third party contributions of ports. So it's safe to say, I think now that we do support every platform with the foreign function API as well. So in that regard, we've come a long way since we started the incubator. But most of the visible changes were to the memory access API.
Wow, that's a lot of work and it's great to hear that every platform is supported. For many years, Java developers, including myself, used JNI as the only way to invoke native code. How is the Foreign Function and Memory API better?
Yeah, so when you're using JNI to access a native library, you have to write this intermediate glue code library. So that is a library, a separate library written in C or C++. You have to learn a different programming language than Java to be able to write those libraries. You have to set up a separate part of your build pipeline to be able to build that library. And...
The new API avoids all that by replacing the programming model with a pure Java one. You can write all the interconnecting code between the Java code you're writing and the native library you're trying to invoke in pure Java. And that makes it...
easier to do because you don't have to set up this build pipeline. You don't have to necessarily learn a different programming language, but it also has several performance benefits such as something that we call the Panama effect, which is that because we bring more of this intermediate layer into the Java side,
it will be more visible to the JIT, right? We're writing pure Java code, it's visible to the JIT compiler. The JIT compiler can optimize that code together in the context of the surrounding Java code, where previously with JNI, you had an intermediate hop into the intermediate library and then another hop into the final library you're trying to invoke. And it's just not very transparent. Those boundaries are not transparent to the JIT compiler.
That sounds very good. I mean, not going through the extra hoops to invoke native code in Java is great from my point of view as a developer. Now, is there anything else that you can tell us about the performance impact of the API, like more than what you already shared with us?
Yes, so as I mentioned, we have the Panama effect where most of the glue code that is written in C with JNI is now on the Java side. So it's more transparent to the JIT compiler and can be optimized together with surrounding Java code. But another thing we spent a lot of time on, with the help of Roland Westrelin from Red Hat, was looking at loop optimizations for
loops that were using longs in the body or using longs as indexes. Because the new API uses longs as offsets everywhere, we needed to make sure that worked well. So we made sure that those are actually just as optimized as the normal int offsets you used to have with byte buffers.
But then the most exciting performance benefit is probably around upcalls, which is when a native library has to call back into Java. We've managed to make that up to three times faster than with JNI. The way that works is we have the foreign linker, and you can give that a method handle that points to some Java code, and the foreign linker will then wrap that inside a kind of machine code stub
that is then exposed back as a C function pointer. So the C function pointer can be passed to native code, and then that native code can use the function pointer to call back into Java. And that's the part that's about 3x faster, three times faster.
Wow, three times faster. That's a great improvement.
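The upcall mechanism described above can be sketched with the C standard library's `qsort`, which takes a comparator function pointer. This is an illustrative example assuming JDK 22 or later on a 64-bit platform (so `size_t` is treated as a Java `long`); the class `QsortDemo` and the method `compareInts` are hypothetical names, and the sketch makes no claim about performance.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

class QsortDemo {
    // Java code that native qsort calls back into for each comparison.
    static int compareInts(MemorySegment a, MemorySegment b) {
        return Integer.compare(a.get(ValueLayout.JAVA_INT, 0), b.get(ValueLayout.JAVA_INT, 0));
    }

    static int[] sort(int[] values) {
        Linker linker = Linker.nativeLinker();

        // void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *))
        MethodHandle qsort = linker.downcallHandle(
                linker.defaultLookup().find("qsort").orElseThrow(),
                FunctionDescriptor.ofVoid(ValueLayout.ADDRESS, ValueLayout.JAVA_LONG,
                        ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));

        // The comparator receives two pointers to int; the target layout makes
        // the pointed-to 4 bytes accessible inside compareInts.
        FunctionDescriptor comparatorDescriptor = FunctionDescriptor.of(
                ValueLayout.JAVA_INT,
                ValueLayout.ADDRESS.withTargetLayout(ValueLayout.JAVA_INT),
                ValueLayout.ADDRESS.withTargetLayout(ValueLayout.JAVA_INT));

        try (Arena arena = Arena.ofConfined()) {
            MethodHandle comparator = MethodHandles.lookup().findStatic(
                    QsortDemo.class, "compareInts", comparatorDescriptor.toMethodType());

            // Wrap the method handle in a stub that is exposed as a C function pointer.
            // The stub lives as long as the arena.
            MemorySegment comparatorPointer =
                    linker.upcallStub(comparator, comparatorDescriptor, arena);

            MemorySegment array = arena.allocateFrom(ValueLayout.JAVA_INT, values);
            qsort.invokeExact(array, (long) values.length,
                    ValueLayout.JAVA_INT.byteSize(), comparatorPointer);
            return array.toArray(ValueLayout.JAVA_INT);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(sort(new int[] {5, 3, 9, 1})));
    }
}
```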
Now, even if Java features go through incubation and preview phases, there are situations when developers like to experiment with those. That's the purpose of the incubation and the preview, to experiment and give feedback mostly. Do you know any particular experiments where foreign function and memory API was introduced, helped, or promises to help significantly?
Yes, so there were, I'm really glad, I want to say upfront, I'm really glad we went into incubation that early.
Because it really, like we've had many big projects try it out. So some of the big names I can say off the top of my head are Lucene, Netty. We had Tomcat, I believe. There's another project called Apache DataSketches. And there were several more greenfield projects set up specifically to try out the FFM API as well.
And that provided a lot of valuable feedback. So like I said, Lucene and Netty were some of the bigger ones. I believe Lucene and Tomcat and maybe even Netty, but I'm not sure, already actually shipped versions that use the FFM API internally.
Both Netty and Lucene used the new FFM API to replace their usage of Unsafe. I believe that is a replacement use case. So it's more about being able to match the performance of Unsafe
than trying to beat it, because Unsafe is very bare metal. So yeah, that's not something you just beat the performance of, but we had to be fast enough to be on par. And I think we succeeded at that. And then I think Lucene also found some, we managed to make the Lucene code,
I believe, safer because we have this whole lifetime management API that they were able to use to more explicitly demarcate the lifetimes of when they were done using resources and then free the memory. So those are some examples.
Wow, that's great to see that this API already helps projects and probably it will help even more once it's finalized in JDK 22.
Apart from what we talked about so far, I'm interested to know if you can share with us any other use cases, like where we as developers should carefully use the Foreign Function and Memory API, anything that we should pay attention to or look out for so that we don't misuse the API.
Yes, so if you have a project that is using Unsafe, or if you have a project that is using JNI, or you are wanting to set up a project that
uses a native library, then the new FFM API is really for you. Especially if you're using Unsafe. Unsafe is, as the name says, an unsafe API that we hope as JDK developers to eventually retire. And the FFM API, especially the memory access part, is intended to replace some of the functionality in Unsafe.
Even though the name says Foreign Memory Access API, the API can also be used to access Java heap memory. So you can wrap a Java array in a memory segment, or you can wrap a Java NIO buffer in a memory segment and then do accesses to that memory using the new API as well.
Great to hear about that.
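A short sketch of that heap-memory use case, assuming JDK 22 or later: a Java `int[]` and an NIO `ByteBuffer` are each wrapped in a memory segment and accessed through the same API used for native memory. The class and method names are hypothetical.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteBuffer;

class HeapSegmentDemo {
    // Wraps a Java array in a heap segment and doubles every element in place.
    static int[] doubleInPlace(int[] values) {
        MemorySegment segment = MemorySegment.ofArray(values);
        long count = segment.byteSize() / ValueLayout.JAVA_INT.byteSize();
        for (long i = 0; i < count; i++) { // long index, as everywhere in the API
            int v = segment.getAtIndex(ValueLayout.JAVA_INT, i);
            segment.setAtIndex(ValueLayout.JAVA_INT, i, v * 2);
        }
        return values; // writes through the segment are visible in the array
    }

    // Wraps an NIO buffer in a segment and reads its first byte.
    static byte firstByteOf(ByteBuffer buffer) {
        return MemorySegment.ofBuffer(buffer).get(ValueLayout.JAVA_BYTE, 0);
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(doubleInPlace(new int[] {1, 2, 3})));
        System.out.println(firstByteOf(ByteBuffer.wrap(new byte[] {7, 8})));
    }
}
```

No arena is involved here because the memory belongs to the Java heap and stays under the garbage collector's control.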
Let's just say that somebody has a native library and would like to integrate it within some Java code. How would they start doing that? How do you start integrating a native library in your Java code, using probably the Foreign Function and Memory API, but maybe some tools?
Yeah, yes. So the best place to start if you have a native library, so there are some requirements. The native library needs to have a C interface. Then you will be able to use it with the FFM API. The best place to start is with the jextract tool, which is a standalone tool. So it's not included in the JDK. It's a standalone tool that we've developed and are still developing actually,
that can be used to parse a header file of a native library and automatically generate Java wrapper functions that can be used to call that native library. So really, all you need is to download jextract. It's on GitHub at github.com/openjdk/jextract. There are instructions in the README there for how to get it, how to download it.
Really, all you need then is the native library header file and a single command line invocation, and you get the source code or class files needed to access the library. So the idea is that you run jextract and then include the classes or source files that it generates in your project, and then you can call it from there. And it's pretty much exactly as in C. It generates a header file
class, essentially. So if I have some library like foo, jextract will generate a foo_h class file. You can use import static foo_h.* and it will import essentially all the functions, and you can call them as if they were C, like that.
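Actual jextract output is not reproduced here. As a rough, hand-written stand-in for the kind of header class described above, the following sketch exposes one C function as a static Java method, assuming JDK 22 or later. The class name `stdlib_h` and the choice of the C `abs` function are assumptions made for illustration; real generated classes contain considerably more.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hand-written illustration only, not generated by jextract.
class stdlib_h {
    // Downcall handle for: int abs(int j). Created once and kept in a
    // static final field so every call reuses the same handle.
    private static final MethodHandle ABS = Linker.nativeLinker().downcallHandle(
            Linker.nativeLinker().defaultLookup().find("abs").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT));

    // Static wrapper that reads like the C function it stands for.
    static int abs(int j) {
        try {
            return (int) ABS.invokeExact(j);
        } catch (Throwable t) {
            throw new AssertionError("should not reach here", t);
        }
    }
}

class JextractStyleDemo {
    public static void main(String[] args) {
        // With a generated class in a named package, a static import would
        // allow writing abs(-5) directly, much like in C.
        System.out.println(stdlib_h.abs(-5));
    }
}
```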
That sounds very interesting. I've been using jextract for some time, sometimes building it myself. That's something that you can do as well, or just download it and work with it. It depends on how urgently you want to develop something with the features within the JDK early access builds as well.
Thank you very much for all this good information, Jorn. Anything else that you would like to add about Project Panama? What makes you excited about it at the moment or also in the future, if you have anything to share with us or for us to also keep an eye and follow?
Yes, so as far as the Project Panama roadmap goes, as far as the foreign function and memory access API goes, we have finalized the API in JDK 22. So it has been moved out of preview state to being a final, fully supported feature. I shouldn't say fully supported because preview features are also fully supported.
But yeah, so the Foreign Function and Memory API is finalized in JDK 22, and
you will be able to use it without the --enable-preview flag. There are some parts of the API that are restricted functionality, which essentially is functionality that might crash the VM if used incorrectly. Those are guarded with an --enable-native-access flag, but that mostly...
That's mostly for parts of the Foreign Function API. The Foreign Memory Access API is pretty much pure Java. You can use it if you don't care about access to native libraries. You can just use it like that. And then, yeah, so there's the finalization in JDK 22, and then we are still working on improving jextract. We are actually kind of rethinking the model of classes that it generates.
And we want to make it so it generates more self-contained snippets that you can more easily edit if you want to or copy-paste to somewhere else if you want to. Yeah, so that's what we're working on right now.
Thank you, Jorn. Thank you for joining us today and sharing all these insights with us. Folks, we're now wrapping up. And if you want to know more about Panama, take a look at the content aggregated on inside.java. There you can find talks and blogs from accomplished engineers working on Project Panama, JEP Café, Sip of Java, Newscasts, and much more, all aimed to help you successfully take advantage of the latest Java developments.
Once again, Jorn, thank you for joining me today. And our listeners, stay tuned on the Java YouTube channel to hear more good news from Java.
Thank you for having me.