Full basic multilingual plane unifont #2535
Comments
There is an external project which did this job: https://github.com/stgiga/UnifontEX/blob/main/UnifontExMonoU8G2.c In general you can create u8g2 font files by yourself with "bdfconv.exe": Lines 255 to 291 in 4b17158
|
That solved my problem. But in my opinion there should be a pre-made full basic multilingual plane font. I guess I'll change my issue to that. |
I guess most embedded systems don't have the memory for this |
ESP32 development boards typically have 4MB of flash. The whole basic multilingual plane takes up 2034754 bytes. |
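As a rough check of that claim, the arithmetic can be sketched like this. The 4 MiB figure is treated as raw flash; the 3 MiB application partition is only an assumed example, since the space actually available to a sketch depends on the board's partition scheme.

```javascript
// Rough flash-budget check for the full-BMP font.
const FONT_BYTES = 2034754;            // size quoted in the comment above
const FLASH_BYTES = 4 * 1024 * 1024;   // 4 MiB of raw flash

// Fraction of a given partition that the font alone would occupy.
function fontShare(partitionBytes) {
  return FONT_BYTES / partitionBytes;
}

// Assumed partition sizes for illustration; check your own partition table.
const partitions = {
  rawFlash: FLASH_BYTES,
  hypothetical3MiBApp: 3 * 1024 * 1024,
};

for (const [name, bytes] of Object.entries(partitions)) {
  console.log(`${name}: ${(fontShare(bytes) * 100).toFixed(1)}% taken by the font`);
}
```

The font fits in raw flash with room to spare, but the application partition is normally smaller than the whole chip, so the partition scheme matters more than the headline flash size.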
Pruning combining characters and other non-characters seems like a good way to save a little bit of space, since they can't be rendered by U8g2 anyway. The original character will still be visible without diacritics or whatever. Here's what I've pruned so far:
These may also be pruned (in whole or parts)
|
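One way to soften the loss of combining marks, offered here as an assumption rather than something done in this thread, is to normalize strings on the host side before they reach the display. NFC composition turns many base-plus-mark sequences into single precomposed code points that the font does contain, and any marks left over can then be dropped. `prepareForDisplay` is a hypothetical helper name.

```javascript
// Compose base + combining mark into a precomposed character where Unicode
// defines one (NFC), then drop any combining marks that are left over.
function prepareForDisplay(text) {
  return text
    .normalize('NFC')          // e.g. "e" + U+0301 becomes U+00E9
    .replace(/\p{M}/gu, '');   // strip remaining marks (general category M)
}

// "Café" written with a combining acute accent (5 code points).
const decomposed = 'Cafe\u0301';
const shown = prepareForDisplay(decomposed);
console.log(shown, shown.length);

// "q" + combining acute has no precomposed form, so only the base survives.
console.log(prepareForDisplay('q\u0301'));
```

Note that `\p{M}` also matches spacing marks used by Indic scripts; narrowing the pattern to `\p{Mn}` may be the safer choice if such text needs to stay readable.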
So, what was the remaining size of the Unifont then? |
Hello, UnifontEX developer here: There ARE Arduinos with 8MiB of RAM, like the Portenta H7. Also I had to use a specific version of bdfconv. Converting everything was done so that even song titles with emoji (which DO exist) can be displayed. You're welcome to compile it yourself as you need, I just did everything for the sake of completeness. Also, I targeted MANY more formats than regular Unifont does, in fact even this library, as well as its siblings. I really do want Unicode dot-matrix LCDs/VFDs/OLEDs. I just figured I'd chime in here. |
I tried to preserve combining characters that are large and to the side of characters. I wrote a little JS to generate the ranges.
const exclusions = [
[0x0300, 0x036F], // Combining Diacritical Marks
[0x1AB0, 0x1AFF], // Combining Diacritical Marks Extended
[0x1CD0, 0x1CD2], // Vedic Extensions
[0x1CD4, 0x1CE8], // Vedic Extensions
[0x1CED, 0x1CED], // Vedic Extensions
[0x1CD4, 0x1CF4], // Vedic Extensions
[0x1CF8, 0x1CF9], // Vedic Extensions
[0x1CFB, 0x1CFF], // Vedic Extensions
[0x1DC0, 0x1DFF], // Combining Diacritical Marks Supplement
[0x20D0, 0x20FF], // Combining Diacritical Marks for Symbols
[0x2DE0, 0x2DFF], // Cyrillic Extended-A
[0xA802, 0xA802], // Syloti Nagri
[0xA806, 0xA806], // Syloti Nagri
[0xA80B, 0xA80B], // Syloti Nagri
[0xA825, 0xA826], // Syloti Nagri
[0xA82C, 0xA82F], // Syloti Nagri
[0xA8B6, 0xA8B6], // Saurashtra
[0xA8C4, 0xA8CD], // Saurashtra
[0xA8DA, 0xA8DF], // Saurashtra
[0xA8E0, 0xA8F1], // Devanagari Extended
[0xA8FF, 0xA8FF], // Devanagari Extended
[0xA926, 0xA92D], // Kayah Li
[0xA947, 0xA95E], // Rejang
[0xA980, 0xA982], // Javanese
[0xA9B3, 0xA9B3], // Javanese
[0xA9B6, 0xA9B9], // Javanese
[0xA9BC, 0xA9BD], // Javanese
[0xA9CE, 0xA9CE], // Javanese
[0xA9DA, 0xA9DD], // Javanese
[0xA9E5, 0xA9E5], // Myanmar Extended-B
[0xA9FF, 0xA9FF], // Myanmar Extended-B
[0xAA28, 0xAA2E], // Cham
[0xAA31, 0xAA32], // Cham
[0xAA35, 0xAA3F], // Cham
[0xAA43, 0xAA43], // Cham
[0xAA4C, 0xAA4C], // Cham
[0xAA4E, 0xAA4F], // Cham
[0xAA5A, 0xAA5B], // Cham
[0xAA7C, 0xAA7C], // Myanmar Extended-A
[0xAAB0, 0xAAB0], // Tai Viet
[0xAAB2, 0xAAB4], // Tai Viet
[0xAAB7, 0xAAB8], // Tai Viet
[0xAABE, 0xAABF], // Tai Viet
[0xAAC1, 0xAAC1], // Tai Viet
[0xAAC3, 0xAADA], // Tai Viet
[0xAAEC, 0xAAED], // Meetei Mayek Extensions
[0xAAF6, 0xAAFF], // Meetei Mayek Extensions
[0xABE5, 0xABE5], // Meetei Mayek
[0xABE8, 0xABEA], // Meetei Mayek
[0xABED, 0xABEF], // Meetei Mayek
[0xABFA, 0xABFF], // Meetei Mayek
[0xD800, 0xDB7F], // High Surrogate
[0xDB80, 0xDBFF], // High Private Use Surrogates
[0xDC00, 0xDFFF], // Low Surrogates
[0xE000, 0xF8FF], // Private Use Area
[0xFE20, 0xFE2F] // Combining Half Marks
];
exclusions.sort((a, b) => a[0] - b[0]);
console.log('Sorted exclusion ranges:');
console.log(exclusions);
// Merge overlapping or adjacent ranges; copy each pair so the logged input is not mutated.
const merged = [[...exclusions[0]]];
for(let i = 1; i < exclusions.length; i++){
  const lastRange = merged[merged.length - 1];
  const currentRange = exclusions[i];
  if(currentRange[0] <= lastRange[1] + 1){
    lastRange[1] = Math.max(lastRange[1], currentRange[1]);
  }else{
    merged.push([...currentRange]);
  }
}
console.log('Merged exclusion ranges:');
console.log(merged);
// Emit the complement of the merged ranges as decimal "start-end" pairs.
let range = '0-';
for(const exclusion of merged){
  range += String(exclusion[0] - 1) + ',' + String(exclusion[1] + 1) + '-';
}
range += '65535';
console.log('bdfconv ranges:');
console.log(range);
This got it down to 2023516 bytes, a saving of 11238 bytes (about 0.55%) from 2034754. Honestly, removing combining characters is worth it less for the space saved than for fixing font rendering by ignoring them. |
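The same idea can be packaged as small reusable functions. This is only a sketch under a few assumptions: the function names are made up for illustration, and the resulting string is presumably what gets passed to bdfconv's `-m` map option. Unlike the snippet above, it does not emit a malformed pair when an exclusion starts at 0 or ends at 65535, and it reports how many code points are dropped, which helps explain why the saving is small.

```javascript
// Merge overlapping or adjacent inclusive [start, end] ranges without
// mutating the input.
function mergeRanges(ranges) {
  const sorted = ranges.map(r => [...r]).sort((a, b) => a[0] - b[0]);
  const merged = [];
  for (const [start, end] of sorted) {
    const last = merged[merged.length - 1];
    if (last && start <= last[1] + 1) {
      last[1] = Math.max(last[1], end);
    } else {
      merged.push([start, end]);
    }
  }
  return merged;
}

// Number of code points covered by already-merged ranges.
function countCodePoints(merged) {
  return merged.reduce((sum, [start, end]) => sum + (end - start + 1), 0);
}

// Build the complement (the ranges to keep) as a decimal "a-b,c-d" string.
function toKeepMap(merged, max = 0xFFFF) {
  const keep = [];
  let cursor = 0;
  for (const [start, end] of merged) {
    if (start > cursor) keep.push(`${cursor}-${start - 1}`);
    cursor = Math.max(cursor, end + 1);
  }
  if (cursor <= max) keep.push(`${cursor}-${max}`);
  return keep.join(',');
}

// Three of the blocks from the list above, as a small sample.
const sample = mergeRanges([[0x0300, 0x036F], [0xD800, 0xDFFF], [0xE000, 0xF8FF]]);
console.log(countCodePoints(sample));
console.log(toKeepMap(sample));
```

The surrogate and private-use ranges merge into one because they are adjacent, so the sample yields two excluded ranges and three kept ranges.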
UnifontEX has the SMP in it. The way it fits under 65535 characters (the base versions used are a factor too) is by removing ALL black hex box placeholders, which frees enough room for Plane 1. |
Also the LVGL version of UnifontEX is 2MiB. |
Plane 1. Oh and UnifontEX also has some Plane 2 and Plane 3 Han characters (what Westerners would call Chinese characters, and what Japanese users would call Kanji.) Most emoji live in Plane 1. Most "Fancy Text" (as the West calls it) lives in Plane 1. Musical notation lives in Plane 1. |
Can UnifontEX be used in u8g2? |
Yes, and I've made a version for it, though it's 6MiB, so it effectively requires an Arduino Portenta H7. But I had converted the whole font. It's the C file that's 6MiB, so the compiled version should be easier: |
I'd love to see what the finished music player looks like. |
This is just a proof of concept UI. poc.mp4 |
UnifontEX actually supports Also I'm loving what you have, it looks so cool! It reminds me of a car music display. What you've made so far is very beautiful. |
Honestly, this is exactly one of the intended use cases. |
This is probably known already, but one limitation of u8g2 is that only the base plane (plane 0) is supported as of now. |
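To see whether a given string is affected by that limitation, the plane of each code point can be checked before the text is sent to the display. A minimal sketch, with hypothetical helper names and a made-up song title:

```javascript
// Unicode plane of a code point: plane 0 is the BMP (U+0000 to U+FFFF).
function planeOf(codePoint) {
  return codePoint >> 16;
}

// List the characters in a string that live outside plane 0.
function outsideBmp(text) {
  // Spreading a string yields whole code points rather than UTF-16 code
  // units, so surrogate pairs stay together.
  return [...text].filter(ch => planeOf(ch.codePointAt(0)) > 0);
}

// Made-up title containing U+1F3B5 (musical note emoji, Plane 1).
const title = 'Night Drive \u{1F3B5} (Remix)';
for (const ch of outsideBmp(title)) {
  const cp = ch.codePointAt(0);
  console.log(`U+${cp.toString(16).toUpperCase()} is in plane ${planeOf(cp)}`);
}
```

Characters reported here would need a fallback (dropping them or substituting a BMP glyph) as long as only plane 0 can be drawn.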
@stgiga How big would UnifontEX be if only the BMP is included? |
It generated fine lol. Also, most characters are in the BMP so the savings just ain't there. |
I specifically used this converter: This specific build did NOT give assert errors when trying to do the entire font, yet RLE still worked. It seems that the other versions of bdfconv have trouble during the RLE step when dealing with the whole font unabridged. If you open the UnifontEX U8G2 C file I provide in my UnifontEX repo, it says that it converted ALL 65414 characters in the BDF, the Plane 1, 2, 3, and 14 stuff included. So U8G2's format supports stuff above Plane 0, but if olikraus is correct, the actual library won't display any of it.
The Arduino Portenta H7 is an Arduino with 8MiB of RAM and 16MiB of flash memory, but it's still an Arduino. Now, the ONLY thing that has a chance at running the Adafruit_GFX version is the Portenta X8, which is more-or-less a Raspberry Pi and Arduino fusion (it can run Linux). Is that even an Arduino anymore? At least the Portenta H7 is a more-conventional Arduino, but it just has a LOT more memory. And yes, I checked to make sure the display libraries I target support it.
Basically, U8G2 UnifontEX can work if olikraus enables stuff above Plane 0, and if the Arduino you use is a Portenta H7 or Portenta X8, assuming you don't do anything over a Raspberry Pi GPIO. Also bdfconv outputs UCGLIB, and UnifontEX exported as THAT will also fit in a Portenta H7's RAM, but with a lot less breathing-room than the U8G2 version. The LVGL version may run on a non-Pro (Portentas are Arduino's pro line) Arduino since it's only 2MiB when compiled. The Adafruit_GFX version is in the C Out of the four display libraries I support (U8G2, UCGLIB, LVGL, and Adafruit_GFX), the most ideal one is the LVGL version. Keep in mind that different libraries support different displays.
For the people who think even an Arduino with 2MiB of RAM (LVGL) is too much to put in your project, there is a fifth way of LCD usage (other than the BDF), and that is using the TTF2PNG version in a character generator IC.
You know those ER3301 font ICs you can buy? Well, UnifontEX flashed to a 1MiB SPI flash chip, like the ones you can buy from Microchip Technology, would be the same package but have MANY more characters available. I'd outright just buy and flash a bunch and then make them available somewhere as a new font IC that supports the vast majority of Unicode. The circuitry to display its contents would be up to you, but would likely involve a DEFLATE decoder IC too. Nothing too wild though. Basically, if you don't like the overhead of using an Arduino but you want a dot-matrix Unicode LCD/VFD/OLED, then there are options.
If you want a VFD, I should mention that Noritake (a VFD company that also does other display technologies to a lesser extent) makes VFDs whose firmware you can bake a 16x16 font of your choice into (getting fancy with their other technologies takes a bit more convincing). You can also customize the driver circuitry, and there is no minimum order quantity. So for 3 years I've wanted to order a VFD from Noritake that has UnifontEX as the display font, no extra hardware required, and I'd get it in that beautiful green glow. Unfortunately, finding a non-VFD analogue to this was not successful, because all the character LCD and character OLED people are still obsessed with 5x7, which just ain't enough. Let's just say that I'm all for more-or-less legally obsoleting said 5x7 text-only displays in favor of ones that use UnifontEX for the purposes of better language support.
And yes, Noritake provides Arduino stuff for their displays. The best base display of theirs you could use would be this one https://www.noritake-elec.com/products/model?part=GU256X128D-D903M which even has touch support. I wish I had the funds to actually do any of this. |
My project involves displaying song titles and artists, and I'm looking for a font that has the most coverage. The most important characters for me are characters with diacritics, Japanese, and Cyrillic. The closest ones I've found are:
Unifont, which doesn't have one font with all the characters,
Efont, which doesn't have diacritics, and
Boutique, which is too small (should be 15/16px tall).
Would it be possible to combine all the Unifont fonts into one big font? A 500kB monstrosity wouldn't really be a problem for me.
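Before settling on a font, it can help to measure what a music library actually needs. The sketch below is an illustration only: it uses JavaScript's Unicode property escapes to report which of a few scripts occur in a list of titles and whether everything stays within plane 0. The helper names and sample titles are assumptions, not part of the project.

```javascript
// Scripts of interest, matched with Unicode property escapes.
const SCRIPT_TESTS = {
  Latin: /\p{Script=Latin}/u,
  Cyrillic: /\p{Script=Cyrillic}/u,
  Hiragana: /\p{Script=Hiragana}/u,
  Katakana: /\p{Script=Katakana}/u,
  Han: /\p{Script=Han}/u,
};

// Return the names of the scripts that occur in any of the given strings.
function scriptsUsed(strings) {
  const joined = strings.join('');
  return Object.keys(SCRIPT_TESTS).filter(name => SCRIPT_TESTS[name].test(joined));
}

// Highest code point seen, to tell whether plane 0 alone would suffice.
function maxCodePoint(strings) {
  return Math.max(0, ...[...strings.join('')].map(ch => ch.codePointAt(0)));
}

// Sample titles for illustration only.
const sampleTitles = ['Déjà vu', 'Группа крови', '夜に駆ける'];
console.log(scriptsUsed(sampleTitles));
console.log(maxCodePoint(sampleTitles) <= 0xFFFF);
```

Running this over real metadata gives a concrete list of scripts to include, which is easier to act on than guessing at coverage up front.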