
“The Panama Effect” with Jorn Vernee

2024/1/8

Inside Java

Topics
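Jorn Vernee: This interview mainly introduces the Foreign Function and Memory API finalized in JDK 22. The foreign memory access API is intended to replace the Unsafe and ByteBuffer APIs. It uses long offsets to support larger memory segments, provides more access modes and a way to describe memory layouts, and supports deterministic allocation and deallocation of memory, addressing the ByteBuffer API's limitations in memory size and access modes. The foreign function API is intended to replace JNI. It lets developers access native libraries with pure Java code, without writing an intermediate glue layer, which simplifies development and improves performance, and avoids having to learn another programming language or set up an extra build pipeline. During incubation and preview, the API went through several iterations. The main improvements were completing the API's functionality, refining the programming model (introducing arenas to manage memory lifetimes), implementing more function call shapes, and adding a fallback implementation based on the libffi library, so that all platforms are now supported. Compared with JNI, the API has a performance advantage known as the "Panama effect": because the glue code moves to the Java side, the JIT compiler can optimize it better. Other performance work includes optimizing loops that use long variables and making callbacks from native libraries into Java up to three times faster. Several large projects (for example Lucene, Netty, and Tomcat) have adopted the API, matching the performance of Unsafe while improving code safety. If a project uses Unsafe or JNI, or needs to access a native library, the API is an ideal choice. It can also be used to access Java heap memory. The jextract tool simplifies integrating a native library into Java code by automatically generating Java wrapper functions. The API is finalized in JDK 22, and jextract continues to be improved.

Ana-Maria Mihalceanu: As host, Ana-Maria Mihalceanu guides the interview, asks questions, and summarizes and adds to Jorn Vernee's answers. She shows strong interest in the evolution of Project Panama and in the use cases and performance gains of the Foreign Function and Memory API, and engages actively with Jorn Vernee so that listeners can clearly understand the details and advantages of these new technologies.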

Deep Dive

Key Insights

What is the Foreign Memory Access API and what are its main goals?

The Foreign Memory Access API is designed to replace the Unsafe and ByteBuffer APIs, addressing their limitations such as size constraints and limited access modes. It supports long offsets, enabling larger memory segments, and includes advanced access modes like volatile and compare-and-swap. It also introduces deterministic memory allocation, allowing developers to explicitly free memory when needed.

How does the Foreign Function API improve upon JNI?

The Foreign Function API eliminates the need for an intermediate glue code layer written in C or C++. Instead, the entire intermediate layer can be written in pure Java, simplifying the process and reducing the need for additional build pipelines or learning new languages. This also improves performance due to better visibility and optimization by the JIT compiler.

What significant changes were made to the Foreign Function and Memory API during its incubation and preview phases?

The API went through several iterations, with significant redesigns of the programming model, particularly around lifetime management of memory resources. The introduction of 'arenas' allows for safer and more deterministic memory management. Additionally, support for more function shapes and platforms was added, and a fallback implementation based on libffi was introduced.

What performance improvements does the Foreign Function and Memory API offer over JNI?

The API benefits from the 'Panama effect,' where the JIT compiler can optimize the glue code written in Java more effectively than the intermediate C/C++ code used in JNI. Additionally, upcalls (native code calling back into Java) are up to three times faster due to the efficient handling of method handles and machine code stubs.

Which Java libraries have already adopted the Foreign Function and Memory API, and what benefits have they observed?

Libraries like Lucene, Netty, and Tomcat have adopted the API, primarily to replace their usage of Unsafe. These libraries have observed performance parity with Unsafe while gaining safer memory management through the API's lifetime management features.

How can developers start integrating a native library into their Java code using the Foreign Function and Memory API?

Developers can use the Jextract tool, which parses a native library's header file and automatically generates Java wrapper functions. This allows developers to call native library functions directly from Java without writing any C/C++ code.

What is the current status of the Foreign Function and Memory API, and what future improvements are planned?

The API is finalized in JDK 22 and no longer requires the --enable-preview flag. Future work includes improving the Jextract tool to generate more self-contained and editable code snippets for easier integration with Java projects.

Chapters
The Foreign Memory Access API replaces the Unsafe and ByteBuffer APIs, offering improvements like long offsets for larger memory segments, various access modes via var handles, and deterministic deallocation for explicit memory freeing. This addresses limitations of previous APIs in memory size, access modes, and memory management control.
  • Replaces the Unsafe and ByteBuffer APIs
  • Uses long offsets for larger memory segments
  • Supports various access modes (set volatile, compare and exchange, etc.)
  • Offers deterministic allocation and explicit memory freeing

Transcript

Hello and welcome back to the Inside Java podcast, a podcast about everything Java, brought to you by the team at Oracle that makes Java. My name is Ana-Maria Mihalceanu, and I recently had a conversation with Jorn Vernee about the finalization of the foreign function and memory API in JDK 22. Project Panama and the foreign memory access API were the subject of episodes 9 and 10 of the Inside Java podcast, but they have gone through many updates since then. So Jorn and I decided to keep you posted on what led to finalizing the foreign function and memory API in JDK 22, the goals of these APIs, potential use cases in your daily developer life, and what comes next in Project Panama. Enjoy the show.

Jorn, welcome. Can you please first introduce yourself?

Hello, I'm Jorn Vernee. I've been working on the foreign function and memory access API for about five years now, and four and a half of those have been at Oracle. During that time, I've mostly been in charge of the foreign function API. I basically rewrote the entire implementation that I inherited and brought it to finalization, while at the same time discussing and designing its public API, as well as the foreign memory access API, together with Maurizio.

Wow, that sounds like a lot. Let's dive into today's subject. I want to start by asking you: what is the foreign memory access API, and what are its goals?

Yeah, so the foreign memory access API is mostly meant as a replacement for the Unsafe and byte buffer APIs, which both have certain limitations. For instance, the byte buffer API only works with int offsets. That means a byte buffer can only span a limited amount of memory, because if it gets too large, you wouldn't be able to access it with int offsets. The new API replaces that with long offsets, which essentially allows you to encapsulate the entire virtual address space of a process in a single memory segment. So that allows you to have much larger memory segments.

On top of that, the byte buffer API is limited in the access modes that it supports. You can only do plain access, or get multiple elements into an array or set multiple elements from an array. The memory access API is built on top of var handles, which support many different access modes, such as set volatile, compare and exchange, and compare and swap, which are really powerful concurrency primitives that power users can take advantage of.

There's also a subsection of the foreign memory access API, the memory layout API, which can be used to describe the layout of a memory segment or of a region of memory. The idea is that you declare the layout of a memory segment, and from that declaration you can derive things such as offsets to certain elements, including nested elements, or derive var handles that can be used to access certain elements within the memory segment. That avoids having to do manual offset computations, for instance.

And then finally, we added deterministic deallocation in this new API, which means that you can now explicitly free memory. This was not possible with, for instance, byte buffer. If you created a byte buffer, even if it was for native memory, so a direct byte buffer, the garbage collector would actually have to come in and free that memory for you. So you had no control over when the memory was freed. That can be problematic, for instance, if you are implementing a database and memory mapping very large files, and on the next loop iteration you want to memory map the next file.

You need to be sure that on the next iteration of the loop, when you want to map the next file, that you actually have address space available in your process. But you can't really rely on that if the garbage collector has to come in and clean up the memory. You need to be able to say as the developer, right now I want this memory to be freed because I want to memory map the next file before moving on.
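As a rough illustration of the memory API features described here, the following sketch assumes JDK 22 or later. The `Point` struct, the class name, and the field names are hypothetical and chosen only for this example. It declares a layout, derives var handles and an offset from it, uses plain, volatile, and compare-and-set access modes, and relies on a confined arena so the native memory is freed at a known point rather than by the garbage collector.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;
import java.util.Arrays;

final class PointLayoutDemo {
    // Hypothetical layout, equivalent to the C struct: struct Point { int x; int y; };
    static final MemoryLayout POINT = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("x"),
            ValueLayout.JAVA_INT.withName("y"));

    // Var handles derived from the layout, so no manual offset arithmetic is needed.
    static final VarHandle X = POINT.varHandle(PathElement.groupElement("x"));
    static final VarHandle Y = POINT.varHandle(PathElement.groupElement("y"));

    static int[] demo() {
        // The arena bounds the lifetime of everything allocated from it.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment point = arena.allocate(POINT);

            // In JDK 22 the second coordinate is the base offset of the struct in the segment.
            X.set(point, 0L, 10);                 // plain write
            Y.setVolatile(point, 0L, 20);         // volatile write
            boolean swapped = X.compareAndSet(point, 0L, 10, 11); // atomic compare-and-set

            int x = (int) X.get(point, 0L);
            int y = (int) Y.getVolatile(point, 0L);

            // Offsets can also be derived from the layout declaration.
            long yOffset = POINT.byteOffset(PathElement.groupElement("y"));

            return new int[] {x, y, (int) yOffset, swapped ? 1 : 0};
        } // the native memory backing 'point' is freed here, deterministically
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

With a recent JDK this can be run directly with `java PointLayoutDemo.java`. A confined arena is used because the memory is only touched by one thread; other arena kinds exist for other lifetime needs.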

Okay, so moving on to the foreign function API, which is the other aspect. This is really meant as a replacement for JNI that is more pure Java. With JNI, in order to access a native library, you have to write a separate intermediate glue code layer, which is a native binary written in C or C++ that acts as an intermediary between your Java code and the native library you want to access. But with the new foreign function API, you can write the intermediate layer purely in Java. That has the advantage that you don't have to learn another programming language to do this. You can do everything from Java.

But also, you don't have to set up a different part of your build pipeline to build this intermediate binary, and you don't have to figure out how to ship it. As long as your user has the native library you're trying to access installed on their system, you can access it directly with pure Java code.
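To make the "pure Java glue code" idea concrete, here is a minimal sketch that calls the C standard library function `strlen` without any C or JNI code. It assumes JDK 22 or later and a 64-bit platform where `size_t` can be modeled as `JAVA_LONG`; the class and method names are hypothetical. On JDK 22, running it without `--enable-native-access` may print a warning, because creating a downcall handle is restricted functionality.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

final class StrlenDemo {
    static long strlen(String s) {
        Linker linker = Linker.nativeLinker();

        // Look up the address of strlen in the platform's default libraries.
        MemorySegment address = linker.defaultLookup().find("strlen").orElseThrow();

        // Describe the C signature: size_t strlen(const char *s).
        // Assumption: size_t is 64 bits wide on this platform.
        MethodHandle handle = linker.downcallHandle(
                address,
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));

        try (Arena arena = Arena.ofConfined()) {
            // Copy the Java string into native memory as a NUL-terminated C string.
            MemorySegment cString = arena.allocateFrom(s);
            return (long) handle.invokeExact(cString);
        } catch (Throwable t) {
            // invokeExact is declared to throw Throwable; wrap it for simplicity.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(strlen("Panama"));
    }
}
```

Everything that JNI would place in a separately built native binary (symbol lookup, argument conversion, the call itself) is expressed here in Java, which is what lets the build stay a plain Java build.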

Wow, all that sounds cool. Well, I've kept an eye on the evolution of this API, especially since my team maintains a hands-on lab on Panama that showcases the foreign function and memory API. So I was a little bit familiar with what changed throughout each incubation and preview. But for the audience, can you please walk us through what changed, and why, between the first incubation and the current state?

Yeah, so incubation is really the experimental stage of the API.

So when we went into the incubator, there were parts of the API that were just unfinished. For instance, some call shapes, such as native functions returning structs by value, weren't implemented, so we worked on that. And then, especially after we went to preview, we did a few iterations where we rewrote and redesigned the entire programming model and really thought that through. I think at the start of preview, it's fair to say that we had a bag of working features, and it took, I think, three iterations of being in preview to really polish that into an API that stands on its own as a whole rather than just being a bag of separate features.

So one of the things that came out of that is that we looked at the programming model with regards to lifetime management of memory resources.

So typically in C, for instance, you can allocate memory with malloc and you can free it with free. And that means that every pointer has its own lifetime, really. Every memory chunk has its own lifetime. And because free is accessible for everyone, anybody can free that memory as well.

But what we ended up with is a model where we have what's called an arena, which denotes an encompassing lifetime. Anybody that has access to that arena can allocate memory in that lifetime. Finally, the arena can be closed, which only the owner of the arena, whoever has access to that object, can do, and closing it frees all the memory that was allocated in it. This is also an important safety feature, because if you have multiple chunks of memory that reference each other, that is safe to do if they share the same lifetime. It prevents situations where one of the chunks of memory is freed while the other chunk of memory is still referencing it, even though it has already gone away. So that is one of the things we polished.
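The arena model can be sketched as follows, assuming JDK 22 or later; the class and method names are hypothetical. Two segments that reference each other are allocated from the same confined arena, so they share one lifetime, and any access after the arena is closed is rejected with an exception instead of touching freed memory.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class ArenaLifetimeDemo {
    static boolean accessAfterCloseFails() {
        MemorySegment value;
        try (Arena arena = Arena.ofConfined()) {
            // Both segments belong to the same arena, so they share one lifetime.
            value = arena.allocate(ValueLayout.JAVA_LONG);
            MemorySegment pointer = arena.allocate(ValueLayout.ADDRESS);

            value.set(ValueLayout.JAVA_LONG, 0, 42L);
            // 'pointer' now holds the address of 'value'; neither can outlive the other.
            pointer.set(ValueLayout.ADDRESS, 0, value);
        } // closing the arena frees both segments together

        try {
            value.get(ValueLayout.JAVA_LONG, 0); // use after free is detected
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("access after close rejected: " + accessAfterCloseFails());
    }
}
```

Compared with `malloc` and `free`, the difference is that only code holding the `Arena` object can end the lifetime, and the segment itself remembers that it is no longer valid.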

And then for the foreign function API, as I said, we implemented some corner cases. Well, not corner cases. We implemented some function shapes returning by value structs.

And more recently, we've also added a fallback implementation based on the libffi library. And then there's been also many third party contributions of ports. So it's safe to say, I think now that we do support every platform with the foreign function API as well. So in that regard, we've come a long way since we started the incubator. But most of the visible changes were to the memory access API.

Wow, that's a lot of work, and it's great to hear that every platform is supported. For many years, Java developers, including myself, used JNI as the only way to invoke native code. How is the foreign function and memory API better?

Yeah, so when you're using JNI to access a native library, you have to write this intermediate glue code library. That is a separate library written in C or C++. You have to learn a programming language other than Java to be able to write those libraries, and you have to set up a separate part of your build pipeline to be able to build that library.

The new API avoids all that by replacing the programming model with a pure Java one. You can write all the interconnecting code between the Java code you're writing and the native library you're trying to invoke in pure Java. That makes it easier, because you don't have to set up this build pipeline and you don't necessarily have to learn a different programming language. But it also has several performance benefits, such as something that we call the Panama effect: because we bring more of this intermediate layer into the Java side, it is more visible to the JIT compiler. We're writing pure Java code, so the JIT compiler can optimize that code in the context of the surrounding Java code. Previously, with JNI, you had a hop into the intermediate library and then another hop into the final library you're trying to invoke, and those boundaries are not transparent to the JIT compiler.

That sounds very good. I mean, not going through the extra hoops to invoke native code in Java is great from my point of view as a developer. Now, is there anything else that you can tell us about the performance impact of the API, like more than what you already shared with us?

Yes, so as I mentioned, we have the Panama effect, where most of the glue code that you would write with JNI is now on the Java side. So it's more transparent to the JIT compiler and can be optimized together with the surrounding Java code. But another thing we spent a lot of time on, with the help of Roland Westrelin from Red Hat, was loop optimizations for loops that use longs in the body or as indexes. Because the new API uses longs as offsets everywhere, we needed to make sure that worked well, so we made sure that those loops are just as optimized as the ones with normal int offsets that you used to have with byte buffers.
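The kind of loop being discussed looks roughly like the sketch below, which assumes JDK 22 or later and uses hypothetical names. The loop variable is a `long` because segment indexes and offsets are `long` values. The example only shows the shape of such code and makes no performance measurement.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class LongLoopDemo {
    static long sum(int count) {
        try (Arena arena = Arena.ofConfined()) {
            // Native array of 'count' ints.
            MemorySegment ints = arena.allocate(ValueLayout.JAVA_INT, count);

            // Segment indexes are longs, so the induction variable is a long.
            for (long i = 0; i < count; i++) {
                ints.setAtIndex(ValueLayout.JAVA_INT, i, (int) i);
            }

            long total = 0;
            for (long i = 0; i < count; i++) {
                total += ints.getAtIndex(ValueLayout.JAVA_INT, i);
            }
            return total;
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(1_000));
    }
}
```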

But the most exciting performance benefit is probably around upcalls, which is when a native library has to call back into Java. We've managed to make that up to three times faster than with JNI. The way that works is that we have the foreign linker, and you can give it a method handle that points to some Java code. The foreign linker will then wrap that inside a kind of machine code stub that is exposed back as a C function pointer. The C function pointer can be passed to native code, and that native code can then use the function pointer to call back into Java. And that's the part that's about three times faster.

Wow, three times faster. That's a great improvement.
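The upcall mechanism described here, a method handle wrapped in a stub and handed to native code as a function pointer, can be sketched with the C standard library's `qsort`. This assumes JDK 22 or later and a 64-bit platform where `size_t` can be modeled as `JAVA_LONG`; the class and method names are hypothetical, and the example demonstrates the mechanism rather than the speedup.

```java
import static java.lang.foreign.ValueLayout.ADDRESS;
import static java.lang.foreign.ValueLayout.JAVA_INT;
import static java.lang.foreign.ValueLayout.JAVA_LONG;

import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;

final class QsortDemo {
    // Java code that the native qsort calls back into for every comparison.
    static int compareInts(MemorySegment a, MemorySegment b) {
        return Integer.compare(a.get(JAVA_INT, 0), b.get(JAVA_INT, 0));
    }

    static int[] sort(int[] values) {
        Linker linker = Linker.nativeLinker();
        try (Arena arena = Arena.ofConfined()) {
            // void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *))
            MethodHandle qsort = linker.downcallHandle(
                    linker.defaultLookup().find("qsort").orElseThrow(),
                    FunctionDescriptor.ofVoid(ADDRESS, JAVA_LONG, JAVA_LONG, ADDRESS));

            // Method handle pointing at the Java comparator.
            MethodHandle comparator = MethodHandles.lookup().findStatic(
                    QsortDemo.class, "compareInts",
                    MethodType.methodType(int.class, MemorySegment.class, MemorySegment.class));

            // The pointers passed to the comparator refer to ints, so give them a target layout.
            FunctionDescriptor comparatorType = FunctionDescriptor.of(
                    JAVA_INT, ADDRESS.withTargetLayout(JAVA_INT), ADDRESS.withTargetLayout(JAVA_INT));

            // The linker wraps the method handle in a stub exposed as a C function pointer.
            MemorySegment functionPointer = linker.upcallStub(comparator, comparatorType, arena);

            MemorySegment array = arena.allocateFrom(JAVA_INT, values);
            qsort.invokeExact(array, (long) values.length, JAVA_INT.byteSize(), functionPointer);
            return array.toArray(JAVA_INT);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sort(new int[] {5, 1, 4, 2, 3})));
    }
}
```

The stub is allocated in the same arena as the array, so the function pointer stays valid for exactly as long as the call needs it.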

Now, even if Java features go through incubation and preview phases, there are situations when developers like to experiment with those. That's the purpose of the incubation and the preview, to experiment and give feedback mostly. Do you know any particular experiments where foreign function and memory API was introduced, helped, or promises to help significantly?

Yes, so I want to say up front, I'm really glad we went into incubation that early, because we've had many big projects try it out. Some of the big names I can name off the top of my head are Lucene and Netty. We had Tomcat, I believe. There's another project called Apache DataSketches. And there were several more greenfield projects set up specifically to try out the FFM API as well.

That provided a lot of valuable feedback. Like I said, Lucene and Netty were some of the bigger ones. I believe Lucene and Tomcat, and maybe even Netty, but I'm not sure, have already shipped versions that use the FFM API internally.

Both Netty and Lucene used the new FFM API to replace their usage of Unsafe. That is a replacement use case, so it's more about being able to match the performance of Unsafe than trying to beat it, because Unsafe is very bare metal. That's not something whose performance you just beat, but we had to be fast enough to be on par, and I think we succeeded at that. I think we also managed to make the Lucene code safer, I believe, because we have this whole lifetime management API that they were able to use to more explicitly demarcate the lifetimes of resources, marking when they were done using them and then freeing the memory. So those are some examples.

Wow, it's great to see that this API already helps projects, and it will probably help even more once it's finalized in JDK 22.

Apart from what we've talked about so far, I'm interested to know if you can share any other use cases with us. Where should we as developers use the foreign function and memory API, and is there anything we should pay attention to so that we don't misuse it?

Yes, so if you have a project that is using Unsafe, or a project that is using JNI, or you want to set up a project that uses a native library, then the new FFM API is really for you. Especially if you're using Unsafe. Unsafe is, as the name says, an unsafe API that we as JDK developers hope to eventually retire, and the FFM API, especially the memory access part, is intended to replace some of the functionality in Unsafe.

Even though the name says foreign memory access API, the API can also be used to access Java heap memory. You can wrap a Java array in a memory segment, or you can wrap an NIO buffer in a memory segment, and then access that memory using the new API as well.

Great to hear about that.
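Wrapping heap memory in a segment can be sketched like this, assuming JDK 22 or later and hypothetical names. The segments are views over the existing array and buffer, so writes through the segment are visible in the original objects, and no native allocation or arena is involved.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteBuffer;
import java.util.Arrays;

final class HeapSegmentDemo {
    static int[] demo() {
        // Wrap a Java array: the segment is a view, not a copy.
        int[] backing = {1, 2, 3, 4};
        MemorySegment overArray = MemorySegment.ofArray(backing);
        overArray.setAtIndex(ValueLayout.JAVA_INT, 2, 30); // writes through to backing[2]

        // Wrap an NIO buffer in the same way.
        ByteBuffer buffer = ByteBuffer.allocate(8);
        MemorySegment overBuffer = MemorySegment.ofBuffer(buffer);
        overBuffer.set(ValueLayout.JAVA_BYTE, 0, (byte) 7); // writes through to buffer

        // isNative() reports whether the segment is backed by off-heap memory.
        return new int[] {backing[2], buffer.get(0), overArray.isNative() ? 1 : 0};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```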

Let's say somebody has a native library and would like to integrate it with some Java code. How would they start doing that? How do you start integrating a native library in your Java code using the foreign function and memory API, and maybe some tools?

Yes. So there are some requirements if you have a native library: it needs to have a C interface. Then you will be able to use it with the FFM API. The best place to start is with the jextract tool, which is a standalone tool, so it's not included in the JDK. It's a tool that we've developed, and are still developing actually, that can be used to parse a header file of a native library and automatically generate Java wrapper functions that can be used to call that native library. So really, all you need is to download jextract. It's on GitHub at github.com/openjdk/jextract, and there are instructions in the README there for how to get it.

All you need then is the native library's header file and a single command-line invocation, and you get the source code or class files needed to access the library. The idea is that you run jextract, include the classes or source files that it generates in your project, and then call the library from there. And it's pretty much exactly like C. From a header file it essentially generates a class. So if I have some library like foo, jextract will generate a foo_h class. You can use import static foo_h.*, which essentially imports all the functions, and you can call them as if you were writing C.

That sounds very interesting. I've been using jextract for some time, sometimes building it myself. That's something that you can do as well, or you can just download it and work with it. It depends on how urgently you want to develop something with the features in the JDK early-access builds.

Thank you very much for all this good information, Jorn. Anything else that you would like to add about Project Panama? What makes you excited about it at the moment or also in the future, if you have anything to share with us or for us to also keep an eye and follow?

Yes, so as far as the Project Panama roadmap goes, and as far as the foreign function and memory access API goes, we have finalized the API in JDK 22. It has been moved out of the preview state to being a final, fully supported feature. I shouldn't say fully supported, because preview features are also fully supported. But yeah, the foreign function and memory API is finalized in JDK 22, and you will be able to use it without the --enable-preview flag.

There are some parts of the API that are restricted functionality, which is essentially functionality that might crash the VM if used incorrectly. Those are guarded by the --enable-native-access flag, but that's mostly for parts of the foreign function API. The foreign memory access API is pretty much pure Java, so if you don't care about access to native libraries, you can just use it like that. So there's the finalization in JDK 22, and then we are still working on improving jextract. We are actually kind of rethinking the model of classes that it generates.

And we want to make it so it generates more self-contained snippets that you can more easily edit if you want to or copy-paste to somewhere else if you want to. Yeah, so that's what we're working on right now.

Thank you, Jorn. Thank you for joining us today and sharing all these insights with us. Folks, we're now wrapping up. If you want to know more about Panama, take a look at the content aggregated on inside.java. There you can find talks and blogs from accomplished engineers working on Project Panama, JEP Café episodes, sips, newscasts, and much more, all aimed at helping you successfully take advantage of the latest Java developments.

Once again, Jorn, thank you for joining me today. And to our listeners, stay tuned to the Java YouTube channel to hear more good news from Java.

Thank you for having me.