fio libaio on Android: Raw UFS Benchmark Guide

Learn to run fio with libaio on Android for raw UFS performance benchmarking, bypassing filesystem effects. Includes NDK build steps, root requirements, kernel checks, and sync/psync alternatives for accurate sequential/random bandwidth and latency.

Is it possible to run fio with libaio on Android to benchmark raw UFS performance? If not, what are the recommended alternatives for measuring sequential and random bandwidth/latency close to the underlying block device?

I aim to benchmark raw UFS device performance on an Android phone, bypassing filesystem effects for results as close as possible to the block device behavior.

On standard Linux, fio with ioengine=libaio and O_DIRECT is commonly used. However, Android presents challenges:

  • fio is cross-compiled via Android NDK and linked against Bionic (not glibc)
  • libaio is absent from standard Android userspace
  • Accessing /dev/block devices requires root
  • CONFIG_AIO may not be enabled in Android kernels

Key questions:

  1. Is it feasible to run fio with ioengine=libaio on Android (e.g., via static linking libaio or custom userspace)?

  2. If libaio is impractical, what is the best approach to benchmark raw UFS performance?

  • fio with sync/psync engines and direct I/O
  • Kernel-side benchmarks or tracing tools
  • Vendor-specific or platform tools

Yes, it's feasible to run fio with libaio on Android for raw UFS benchmarking. Community ports cross-compile fio with the Android NDK and statically link libaio, which is only a thin wrapper around the kernel's AIO syscalls and does not depend on glibc. You need root access to open /dev/block devices directly, and the kernel must be built with CONFIG_AIO (the Android base kernel config enables it, but verify on your device). If libaio gives trouble, fio's sync or psync engines with direct=1 still measure sequential and random bandwidth and latency close to the block device, with parallelism coming from numjobs instead of iodepth.



Feasibility of fio with libaio on Android

Running fio with the libaio ioengine on Android? Short answer: yes, but it's not plug-and-play. On standard Linux, libaio is the usual choice because it submits many requests without blocking, which keeps the device queue full and suits raw UFS benchmarks that skip the filesystem and page cache. Note that libaio is not part of glibc: it is a small standalone library wrapping the kernel's io_setup, io_submit, and io_getevents syscalls. Android's userspace simply doesn't ship it, and Bionic has no equivalent. No worries. Developers have ported it.

Take the android-fio-with-libaio repo. It bundles fio 3.33 with libaio 0.3.113, cross-compiled via the NDK and statically linked, which sidesteps the missing shared library. AOSP's own external/fio tree also carries the libaio engine, including the userspace_reap option, which lets fio reap completions directly from the AIO ring in user space instead of calling io_getevents.

Does it measure raw UFS performance? Yes, when you pair it with direct=1 (O_DIRECT) and target a /dev/block path such as /dev/block/bootdevice/by-name/userdata. I/O then goes to the block layer without passing through a filesystem or the page cache. Sequential reads might top 1.5-2 GB/s on UFS 3.1 phones; random 4K results are better judged in IOPS than in MB/s. Root is non-negotiable: without it, opening the device fails with permission denied.

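Challenges pop up. Older or stripped-down kernels might lack CONFIG_AIO. libaio issues the AIO syscalls itself rather than going through Bionic, so a static build still needs testing on the target device. And latency? libaio pays off at higher queue depths, where many requests are in flight; at queue depth 1 it behaves much like a synchronous engine. The I/O scheduler (mq-deadline, bfq, kyber, or none on current kernels; CFQ only on pre-5.0 ones) can also skew results. Still, it's doable, and closer to hardware than buffered I/O.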


Building and Running fio libaio

Ready to build? Grab the NDK (r25+ works best). Clone that android-fio-with-libaio repo—it has a build.sh script that handles everything.

Here’s the flow:

  1. Install NDK: export ANDROID_NDK_HOME=/path/to/ndk
  2. ./build.sh — spits out fio-static-arm64 or whatever your target is.
  3. adb push fio-static-arm64 /data/local/tmp/fio
  4. Root your phone (Magisk, etc.): adb shell su
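  5. Run: /data/local/tmp/fio --name=seqread --readonly --ioengine=libaio --direct=1 --filename=/dev/block/bootdevice/by-name/userdata --rw=read --bs=128k --size=1G --numjobs=1 --iodepth=32 --runtime=60 --time_based --group_reporting

Why these params? --name defines the job; fio does not start one from the command line without it. --direct=1 enforces O_DIRECT for raw block access. --iodepth=32 keeps up to 32 async requests in flight. --time_based makes the job loop over the 1G region for the full 60 seconds instead of stopping after one pass, and --readonly makes fio refuse any write workload. Adjust --filename to your UFS partition (ls -l /dev/block/bootdevice/by-name finds it; userdata or system suits benchmarks).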
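Resolving the by-name link to its real device node before benchmarking makes it obvious which partition is about to be opened. The following is a minimal sketch; resolve_part is a hypothetical helper name, and the default directory is an assumption (some devices expose /dev/block/by-name instead).

```shell
#!/bin/sh
# resolve_part NAME [DIR] -> prints the real device node behind a by-name link.
# DIR defaults to /dev/block/bootdevice/by-name (an assumption; pass another
# directory if your device lays out its links differently).
resolve_part() {
    dir=${2:-/dev/block/bootdevice/by-name}
    link="$dir/$1"
    if [ ! -e "$link" ]; then
        echo "no such partition link: $link" >&2
        return 1
    fi
    # Follow the symlink chain to the underlying node, e.g. /dev/block/sdaNN.
    readlink -f "$link"
}
```

On a rooted device, resolve_part userdata prints the node to pass to --filename.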

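Tweak libaio options: add --userspace_reap if reaping completions through io_getevents misbehaves. Read workloads are non-destructive, but any write, randwrite, or randrw job aimed at a raw partition overwrites whatever is stored there, including the filesystem on userdata. Test small first (--size=100M) and back up before running anything that writes.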
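Because a write workload on a raw partition is destructive, a small guard in front of fio can prevent accidents. This is only a sketch: is_destructive, guard_fio, and the ALLOW_WRITE variable are hypothetical names invented here, not part of fio.

```shell
#!/bin/sh
# is_destructive RW -> returns 0 (true) unless RW is a pure read workload.
# Unknown values are treated as destructive to stay on the safe side.
is_destructive() {
    case "$1" in
        read|randread) return 1 ;;
        *) return 0 ;;
    esac
}

# guard_fio RW TARGET -> refuses destructive workloads unless ALLOW_WRITE=1.
guard_fio() {
    rw=$1
    target=$2
    if is_destructive "$rw" && [ "${ALLOW_WRITE:-0}" != "1" ]; then
        echo "refusing rw=$rw on $target: set ALLOW_WRITE=1 to proceed" >&2
        return 1
    fi
    echo "ok: rw=$rw on $target"
}
```

A wrapper script could call guard_fio "$RW" "$DEV" and only then invoke fio; fio's own --readonly flag offers a similar safety net for read-only runs.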

Success looks like:

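read: IOPS=45k, BW=180MiB/s (189MB/s)(11.4GB/60001msec); lat(us): min=12, max=245, avg=22.1

Treat that line as an illustration of the output format rather than a result for the command above: 45k IOPS at 180 MiB/s implies 4 KiB requests, whereas 128 KiB requests at that bandwidth would be only about 1.4k IOPS. An average latency of 22 µs does not fit iodepth=32 at that rate either.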
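A quick sanity check for any fio result line is that bandwidth must equal IOPS times block size. The helper below is a sketch of that arithmetic; bw_mib is a hypothetical name.

```shell
#!/bin/sh
# bw_mib IOPS BS_KIB -> bandwidth in MiB/s implied by IOPS at that block size.
bw_mib() {
    awk -v iops="$1" -v bs="$2" 'BEGIN { printf "%.1f\n", iops * bs / 1024 }'
}

bw_mib 45000 4     # prints 175.8  (close to the 180 MiB/s in the sample line)
bw_mib 45000 128   # prints 5625.0 (what 45k IOPS would mean at bs=128k)
```

If the reported BW and IOPS disagree with the block size you asked for, check that the job really ran with the parameters you intended.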

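But what if the build fails? The usual suspects are an NDK version mismatch or wrong target flags (the NDK toolchain target is aarch64-linux-android; the Android ABI name is arm64-v8a). A libaio load error at run time means the binary was linked dynamically. Static linking avoids that, since Android has nothing comparable to the libaio-dev package on glibc distros.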


Android Kernel and Root Requirements

Android kernels since 4.9-ish enable CONFIG_AIO=y by default. Check yours: zcat /proc/config.gz | grep CONFIG_AIO (this works only if the kernel exposes /proc/config.gz). If it prints "# CONFIG_AIO is not set", you're stuck: no libaio. Vendor kernels (Samsung, Pixel) usually have it for ADB/MTP async I/O.

The Android kernel configs repository confirms that the android-base config sets it. Pixels? Yes. Custom ROMs? Verify.
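The config check can be scripted so that a benchmark wrapper falls back to psync when kernel AIO is missing. A minimal sketch, assuming the kernel config text is piped in on stdin; aio_status is a hypothetical helper name.

```shell
#!/bin/sh
# aio_status -> reads kernel config text on stdin and prints
# "enabled", "disabled", or "unknown" for CONFIG_AIO.
aio_status() {
    awk '
        /^CONFIG_AIO=y$/            { s = "enabled" }
        /^# CONFIG_AIO is not set$/ { s = "disabled" }
        END { print (s == "" ? "unknown" : s) }
    '
}

# On a device that exposes its config:
#   zcat /proc/config.gz | aio_status
```

An "unknown" result means the option does not appear in the config text at all, in which case the only reliable test is to run a short libaio job and see whether io_setup succeeds.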

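Root: Essential. On stock Android, both file permissions and SELinux policy keep apps and the adb shell user away from /dev/block nodes. A root shell from Magisk's su is usually enough; if access is still denied, look for SELinux (avc) denials in dmesg or logcat.

Without root? Forget direct block access. The fallback is a file under /sdcard, but that sits on ext4 or F2FS with a storage emulation layer on top, so the filesystem taints the results.

A note on read-ahead: blockdev --setra 4096 /dev/block/... changes the read-ahead window, but read-ahead only applies to buffered reads through the page cache. With direct=1 it has no effect, so it will not move raw results closer to spec-sheet UFS speeds.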


Alternatives: sync and psync Engines

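Libaio too fiddly? No sweat. Fio's sync engine (lseek plus read/write) or psync engine (pread/pwrite) with direct=1 still gives usable raw UFS numbers. Both are synchronous, and both are always available in fio4Android ports with no extra linking. Note that psync is not POSIX AIO; fio's POSIX AIO engine is a separate one called posixaio.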

Sync: Simple, one-request-at-a-time. Use for latency focus.

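fio --name=seqread --readonly --ioengine=sync --direct=1 --rw=read --bs=1M --size=4G --numjobs=4 --runtime=300 --time_based --group_reporting --filename=/dev/block/bootdevice/by-name/userdata

Psync: Same idea, but each request is a single pread or pwrite call instead of lseek followed by read or write, so it saves one syscall per I/O. It is still synchronous: one request in flight per job.

Pros over libaio: No extra library to build. Cons: iodepth is effectively 1 per job, so parallelism has to come from numjobs, which costs more CPU and context switches than a single libaio job at the same depth.

Both bypass the filesystem with O_DIRECT. Large-block sequential reads can approach what the hardware sustains even with few jobs (UFS 4.0 parts are advertised at 4 GB/s and above). Random 4K is different: the 500k IOPS class figures quoted for flagships assume a deep queue such as QD32, which sync and psync can only approximate by running many jobs.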

Compare:

Engine   Async?         Latency edge               Build effort          UFS fit
libaio   Yes (kernel)   Best at high queue depth   High (NDK + libaio)   Ideal raw
sync     No             Good at QD1                None                  Sequential runs
psync    No             Solid at QD1               None                  Balanced

Sync/psync win for quick tests. With direct=1 and enough parallel jobs they get close to libaio for most purposes, without needing kernel AIO at all.
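Since sync and psync keep only one request in flight per job (fio's psync engine issues pread/pwrite calls), a deeper queue has to be emulated with numjobs. The sketch below generates a read-only job file for that; make_psync_job is a hypothetical helper, and the sizes and runtime are placeholder values to adjust.

```shell
#!/bin/sh
# make_psync_job DEPTH FILENAME -> prints a fio job file on stdout that runs
# DEPTH parallel psync jobs, approximating a queue depth of DEPTH.
make_psync_job() {
    cat <<EOF
[global]
ioengine=psync
direct=1
rw=randread
bs=4k
size=1G
runtime=60
time_based
group_reporting
filename=$2

[qd-emulated]
numjobs=$1
EOF
}

# Example (on the device, as root):
#   make_psync_job 32 /dev/block/bootdevice/by-name/userdata > /data/local/tmp/psync.fio
#   /data/local/tmp/fio --readonly /data/local/tmp/psync.fio
```

Expect this to use noticeably more CPU than one libaio job with iodepth=32, since every in-flight request needs its own thread or process.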


Kernel Tracing and Other Tools

Want zero-userspace overhead? Dive kernel-side.

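Blktrace: Captures block I/O events per request. It needs a kernel built with CONFIG_BLK_DEV_IO_TRACE and a mounted debugfs: blktrace -d /dev/block/bootdevice/by-name/userdata -o trace. Parse with blkparse. If no blktrace binary is available, the same events are exposed through ftrace: echo 1 > /sys/kernel/debug/tracing/events/block/enable. Metrics: BW, latency histograms. Root req'd.

Perf: perf record -e 'block:*' -a -- sleep 60 while dd or fio runs (quote the glob so the shell does not expand it). The samples show where requests are issued and completed, which helps expose scheduler quirks.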

Iotop: Monitors per-process I/O, but not raw perf.

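Vendor tools? SoC and storage vendors have their own, but they are internal and NDA-walled (Qualcomm's QPST, for one, is a service and flashing suite rather than a storage benchmark). Andiostat (app) approximates, but filesystem-bound.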

For ultimate raw: Custom kernel module with blk-mq direct paths. Overkill, though.

Blktrace shines for latency distributions—see UFS queue depths live.


Example Commands and Expected UFS Results

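Assumed setup: a rooted UFS 3.1 phone (e.g., Pixel 6) where userdata is /dev/block/sda21. The partition number differs between devices, so resolve it through by-name first.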

Seq Read (libaio):

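fio --name=seqread --readonly --ioengine=libaio --direct=1 --rw=read --bs=1M --iodepth=32 --size=8G --runtime=120 --filename=/dev/block/sda21

Expect: BW ~1.7 GB/s. With 32 requests of 1 MiB in flight, average completion latency then lands near 20 ms (Little's law: latency = queue depth / IOPS), not in the tens of microseconds.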
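Bandwidth, block size, and queue depth together fix the average completion latency through Little's law (latency = queue depth / IOPS). The helper below is a sketch of that arithmetic for checking whether a reported latency is plausible; avg_lat_ms is a hypothetical name.

```shell
#!/bin/sh
# avg_lat_ms BW_MB_S BS_KIB IODEPTH -> average completion latency in ms
# implied by Little's law for a saturated queue.
avg_lat_ms() {
    awk -v bw="$1" -v bs="$2" -v qd="$3" 'BEGIN {
        iops = bw * 1000000 / (bs * 1024)   # requests completed per second
        printf "%.1f\n", qd / iops * 1000   # seconds per request -> ms
    }'
}

avg_lat_ms 1700 1024 32   # prints 19.7 for 1.7 GB/s, 1 MiB blocks, QD32
```

Dropping the queue depth lowers the latency figure but usually also the bandwidth, which is why latency and throughput are best measured in separate runs.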

Rand 4K (sync):

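fio --name=rand4k --readonly --ioengine=sync --direct=1 --rw=randread --bs=4k --iodepth=1 --size=1G --numjobs=1 --runtime=60 --filename=/dev/block/sda21

Expect: lat 20-100µs. At queue depth 1, IOPS is simply the inverse of average latency, so that range corresponds to roughly 10k-50k IOPS; a figure like 80k IOPS would need an average latency of 12.5µs.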
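With a single synchronous job there is exactly one request in flight, so IOPS and average latency are two views of the same number. A small sketch of the conversion; qd1_iops is a hypothetical helper name.

```shell
#!/bin/sh
# qd1_iops LAT_US -> IOPS at queue depth 1 for an average latency in microseconds.
qd1_iops() {
    awk -v lat="$1" 'BEGIN { printf "%d\n", 1000000 / lat }'
}

qd1_iops 20     # prints 50000
qd1_iops 100    # prints 10000
```

If fio reports QD1 IOPS far above what its own latency figure allows, the run was probably served from a cache or was not using direct I/O.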

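Full job file (randrw-mixed.fio). fio job files use an INI-style format, not JSON. Warning: randrw writes to the target, so running this against userdata destroys its contents. Point it at a scratch partition or a device you have backed up.

[ufs-randrw]
ioengine=libaio
direct=1
filename=/dev/block/bootdevice/by-name/userdata
rw=randrw
bs=4k
size=2G
iodepth=32
numjobs=4
runtime=300

fio randrw-mixed.fio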

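Results vary: Thermal throttling kills sustained runs, so keep the phone cool and watch for bandwidth dropping over the course of a run. UFS 4.0 doubles the per-lane interface rate of UFS 3.1, so sequential ceilings can rise by up to about 2x; random results depend more on the controller and NAND.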


Sources

  1. android-fio-with-libaio — Community repo with NDK build scripts for static fio+libaio on Android: https://github.com/LeeKyuHyuk/android-fio-with-libaio
  2. AOSP fio libaio engine — Android-specific libaio adaptations including userspace_reap: https://android.googlesource.com/platform/external/fio/+/l-preview/engines/libaio.c
  3. fio4Android — Prebuilt fio with sync/psync engines and Android.mk for NDK: https://github.com/royzhao/fio4Android
  4. Android kernel configs — CONFIG_AIO=y in base defconfig since 2018: https://android.googlesource.com/kernel/configs/+/4ed084e489fd1a4b5776715538979d3e921969f0
  5. fio documentation — Libaio engine details and direct I/O requirements: https://fio.readthedocs.io/en/latest/fio_doc.html
  6. Stack Overflow: fio libaio on Android — Discussion of raw UFS benchmarking challenges: https://stackoverflow.com/questions/79862952/is-it-possible-to-run-fio-with-libaio-on-android-to-measure-raw-ufs-performance

Conclusion

Fio with libaio on Android gives you raw UFS benchmarking: build from a port like android-fio-with-libaio, get root, and target the block device with direct=1. The sync and psync fallbacks need no extra library and get close to the same numbers once numjobs supplies the parallelism that iodepth cannot. Pick based on effort versus precision, keep write workloads away from partitions you care about, and check every result against its block size and queue depth before trusting it.

Authors
Verified by moderation
NeuroAnswers
Moderation