**argv minus one** @argv_minus_one@mastodon.sdf.org · 3d

The number of code pages defined by #IBM for its mainframes and #IBMPC is absolutely staggering.

Take a look: https://en.wikipedia.org/wiki/Code_page#IBM_code_pages

Thank goodness for #Unicode!

#i18n
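
To make the point concrete, here is a small sketch using only Python's built-in codecs. `cp037`, `cp437` and `cp850` are three of the IBM code pages that ship with Python; the byte value is picked purely for illustration.

```python
# The same single byte means different things under different IBM code pages.
raw = bytes([0xC1])

for codec in ("cp037", "cp437", "cp850"):
    # cp037 is an EBCDIC mainframe code page; cp437 and cp850 are IBM PC code pages.
    print(codec, repr(raw.decode(codec)))

# Once decoded, every character has exactly one Unicode code point,
# no matter which code page the bytes originally came from.
ebcdic_char = raw.decode("cp037")
print(f"U+{ord(ebcdic_char):04X}")
```

Without knowing which code page produced a byte, there is no reliable way to tell which character was meant.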

**Sven** @doctorwhom@mastodon.social · 4d

So "U+1F4D3 NOTEBOOK" or "U+1F4D4 NOTEBOOK WITH DECORATIVE COVER" which is better? But both are necessary for international language communication.

#Unicode #Emoji

**Terence Eden’s Blog** @blog@shkspr.mobi · Jul 25, 2022 *

The (Mostly) Complete Unicode Spiral

https://shkspr.mobi/blog/2022/07/the-mostly-complete-unicode-spiral/

I present to you, dear reader, a spiral containing every[0] Unicode 14 character in the GNU Unifont. Starting at the centre with the control characters, spiralling clockwise through the remnants of ASCII, and out across the entirety of the Basic Multilingual Plane. Then beyond into the esoteric mysteries of the Higher Planes[1].

Zoom in for the massiveness. It's a 10,000x10,000px image. Because the Unifont displays individual characters in a 16x16px square, it is quite legible even when printed out on a domestic laser printer at 600dpi:

@edent (Terence Eden), 14:09 - Fri 08 July 2022:
Still a Work In Progress.
Created a proper spiral of every Unicode character in the Unifont.
At 600dpi, it is *just about* legible.
The full thing would need to be printed on A2 sized paper! pic.x.com/jsrofjt4xd

I also made it as a square spiral - which fits into a smaller space.

Again, printed out at 600dpi it is readable. Just!

@edent (Terence Eden), 07:18 - Tue 05 July 2022:
Printed out on A4 @ 600dpi.
Amazingly, it is just about legible!
Still a bit more work to do, but quite pleased with the results so far. pic.x.com/d5kwfwhq27

Printed onto A0 - 841mm square - it's a bit better. The ASCII set is readable:

But characters in CJK weren't particularly legible:

If I wanted the 16px symbols to each be 5mm wide, I'd need to print this on paper over 3 metres wide!
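
The arithmetic behind that figure can be sketched in a few lines. The 10,000px width and 16px glyph cell come from the post; the variable names are just for illustration.

```python
IMAGE_PX = 10_000      # image width stated in the post
GLYPH_PX = 16          # Unifont draws each glyph in a 16x16px cell
TARGET_GLYPH_MM = 5    # desired printed width of one glyph

# How many glyph cells fit across the image, and how wide the paper must be
# if each of those cells is printed 5mm wide.
glyphs_across = IMAGE_PX / GLYPH_PX
paper_width_mm = glyphs_across * TARGET_GLYPH_MM

print(f"{glyphs_across:.0f} glyph cells across")
print(f"{paper_width_mm / 1000:.3f} m of paper")
```

That works out at 625 cells and 3.125 metres, which is where "over 3 metres" comes from.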

WHY??!?

Because visualising one-dimensional data structures in two-dimensional space is fun! That's why

I was inspired by seeing two lovely pieces of artwork recently.

The first was 2015's Unicode in a spiral by Reddit user cormullion.

It's gorgeous, but doesn't include all characters. Oh, and you also have to rotate your head to read each character.

There's a larger version which covers a lot more of the Basic Multilingual Plane. It's an 18MB PDF. And, because of the resolution of the font, it needs to be printed out on a 1 metre square at a minimum.

The second interesting thing I found was a 2016 Hilbert Curve of Unicode:

@smly, 11:18 - Thu 26 October 2017:
UNICODE in the picture frame. The placement of characters along the Hilbert curve is beautiful. Original: github.com/hakatashi/unic… pic.x.com/f69hwyzlvc

The Hilbert Curve poster is beautiful. But it only goes up to Unicode 10 - and we're on Unicode 14 by now. Despite the æsthetically pleasing nature of fractal curves, I find them quite un-intuitive.

Neither shows off the gaps in Unicode. That is, where there is space to fit more symbols.

So I wanted to do something which satisfied these criteria:

- Contained all of Unicode 14
- Was legible at a small size
- Showed spaces where there are empty sections
- Readable without tilting one's head
- Slightly more visually interesting than a grid

HOW?!?!

I've written before about the wonders of the Unifont. It contains all of the Unicode 14 glyphs - each squeezed down into a 16x16px box. Even emoji!

Well. Mostly…

Limitations

Although I wanted every character, there are some practical problems. Firstly:

Unifont only stores one glyph per printable Unicode code point. This means that complex scripts with special forms for letter combinations including consonant combinations and floating vowel marks such as with Indic scripts (Devanagari, Bengali, Tamil, etc.) or letters that change shape depending upon their position in a word (Indic and Arabic scripts) will not render well in Unifont.

So there are some scripts which will look a bit ugly. And some characters which won't be well represented.

The second issue is one of size. Some of the newer characters are simply too big:

Scripts such as Cuneiform, Egyptian Hieroglyphs, and Bamum Supplement will not be drawn on a 16-by-16 pixel grid. There are plans to draw these scripts on a 32-by-32 pixel grid in the future.

That means it misses out on characters like 𒀰, 𒁏 and, of course, 𒀱. Which, to be fair, would be hard to squeeze in!

The third problem is that Unicode is updating all the time. Although the Unifont is at Version 14 - Python's Unicode Database is stuck at V13. Luckily, there is a library called UnicodeData2 which includes V14.
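
A quick way to check which Unicode version a given Python can see is sketched below. It assumes the third-party `unicodedata2` package exposes the same `unidata_version` attribute as the standard library module (it is meant to be a drop-in replacement), and falls back to the built-in module when the package isn't installed.

```python
# Prefer the unicodedata2 backport when it is installed, otherwise
# fall back to the database bundled with the running Python.
try:
    import unicodedata2 as ucd
except ImportError:
    import unicodedata as ucd

# A version string such as "13.0.0" or "14.0.0", depending on the install.
print(ucd.__name__, ucd.unidata_version)
```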

But, given those limitations, I thought it was possible to craft something nice.

Python Code

I split the problem into several parts.

Plotting equidistant points along a spiral

As ever, I turned to StackOverflow and found a neat little solution:

    import math

    def spiral_points(arc=1, separation=1):
        #   Adapted from https://stackoverflow.com/a/27528612/1127699
        """generate points on an Archimedes' spiral
        with `arc` giving the length of arc between two points
        and `separation` giving the distance between consecutive turnings
        - approximate arc length with circle arc at given distance
        - use a spiral equation r = b * phi
        """
        def polar_to_cartesian(r, phi):
            return ( round( r * math.cos(phi) ),
                     round( r * math.sin(phi) )
                   )

        # yield a point at origin
        yield (0, 0)

        # initialize the next point in the required distance
        r = arc
        b = separation / (2 * math.pi)
        # find the first phi to satisfy distance of `arc` to the second point
        phi = float(r) / b
        while True:
            yield polar_to_cartesian(r, phi)
            # advance the variables
            # calculate phi that will give desired arc length at current radius (approximating with circle)
            phi += float(arc) / r
            r = b * phi

Drawing a squaril

I wanted a grid which looked like this:

    9 A B
    8 1 2
    7 0 3
    6 5 4

I found a blog post and source code for a spiral array. It's pretty simple - although I'm sure there are lots of ways to do this:

    n = 12
    nested_list = [[0 for i in range(n)] for j in range(n)]
    low = 0
    high = n - 1
    x = 1
    levels = int((n + 1) / 2)
    for level in range(levels):
        for i in range(low, high + 1):
            nested_list[level][i] = x
            x += 1
        for i in range(low + 1, high + 1):
            nested_list[i][high] = x
            x += 1
        for i in range(high - 1, low - 1, -1):
            nested_list[high][i] = x
            x += 1
        for i in range(high - 1, low, -1):
            nested_list[i][low] = x
            x += 1
        low += 1
        high -= 1
    for i in range(n):
        for j in range(n):
            # print the row elements with a tab space after each element
            print(nested_list[i][j], end="\t")
        # Print in new line after each row
        print()

However, that printed the spiral backwards:

    B A 9
    2 1 8
    3 0 7
    4 5 6

Luckily, Python makes it easy to reverse lists:

    for l in nested_list:
        l.reverse()
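
As an aside, the same ordering can be produced without building a grid first. The generator below is a sketch of one alternative, not the code used for the image: `square_spiral` is a made-up name, and it assumes image-style co-ordinates where y grows downwards. It walks up, right, down, left with run lengths 1, 1, 2, 2, 3, 3 and so on, which reproduces the wanted layout with 0 in the centre and 1 directly above it.

```python
from itertools import islice

def square_spiral():
    """Yield (x, y) cell offsets spiralling clockwise out from (0, 0).

    y grows downwards, as in image co-ordinates, so "up" is (0, -1).
    """
    x = y = 0
    yield (x, y)
    directions = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # up, right, down, left
    direction = 0
    run = 1
    while True:
        # Each run length is used for two consecutive sides, then grows by one.
        for _ in range(2):
            dx, dy = directions[direction % 4]
            for _ in range(run):
                x += dx
                y += dy
                yield (x, y)
            direction += 1
        run += 1

first_twelve = list(islice(square_spiral(), 12))
print(first_twelve)
```

Multiplying each offset by 16 and adding the centre of the canvas would give the pixel position for the n-th character.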

Drawing the characters

Turning a number into a Unicode character is as simple as:

unicode_character = chr(character_int)

But how do we know if the font contains that character? I stole some code from StackOverflow which uses the FontTools library:

    from fontTools.ttLib import TTFont

    font = TTFont(fontpath)   # specify the path to the font in question

    def char_in_font(unicode_char, font):
        for cmap in font['cmap'].tables:
            if cmap.isUnicode():
                if ord(unicode_char) in cmap.cmap:
                    return True
        return False

But, of course, it is a bit more complicated than that. The Unifont contains some placeholder glyphs - the little black square with hex digits in them that you see here:

I didn't want to draw them. But they exist in the font. So how do I skip them?

Using the Python Unicode Database it's possible to look up the name of a Unicode code-point. e.g. chr(65) is LATIN CAPITAL LETTER A. So if there is no name in the database, skip that character.

But, of course, it is a bit more complicated than that! The Unicode database only goes up to Unicode 13. And, for some reason, the control characters don't have names. So the code becomes a tangled mess of if...else statements. Ah well!
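
A stripped-down version of that check might look like the sketch below. It is not the code from the repository: `worth_drawing` is a hypothetical helper, and it uses the standard library's `unicodedata` rather than UnicodeData2, so which newer characters it recognises depends on the Python version running it.

```python
import unicodedata

def worth_drawing(code_point):
    """Return True if the code point is assigned and should get a glyph."""
    ch = chr(code_point)
    # Assigned characters have a name; passing a default avoids the
    # ValueError that unicodedata.name() raises for unnamed code points.
    if unicodedata.name(ch, None) is not None:
        return True
    # Control characters (general category "Cc") have no name in the
    # database, so they need their own special case.
    return unicodedata.category(ch) == "Cc"

print(worth_drawing(65))       # LATIN CAPITAL LETTER A
print(worth_drawing(0))        # a control character
print(worth_drawing(0xFFFF))   # a noncharacter, never assigned
```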

Drawing the characters should have been easy. I was using Pillow to draw text. But, despite the pixely nature of the font itself Pillow was performing anti-aliasing - creating unwanted grey subpixels.

I thought the fix was simple:

@jonodrew (jonodrew@mastodon.social), replying to @xandypty and @edent, 08:33 - Mon 04 July 2022:
draw = ImageDraw.Draw(image)
draw.fontmode = '1'
...

Sadly, that does introduce some other artefacts - so I've raised a bug with Pillow.

In the end, I kept the anti-aliasing, but then converted the grey pixels to black. And then converted the entire image to monochrome:

    threshold = 191
    image = image.point(lambda p: p > threshold and 255)
    image = image.convert('1')
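
The `p > threshold and 255` expression is a little terse. For an 8-bit image, Pillow's `point()` calls the function once for each possible pixel value and uses the results as a lookup table, so the effect can be shown without Pillow at all. The table below is only an illustration of that mapping.

```python
threshold = 191

# Build the same lookup table the lambda would produce for an 8-bit image.
# `p > threshold and 255` is False (treated as 0) up to the threshold,
# and 255 above it.
lut = [int(p > threshold and 255) for p in range(256)]

print(lut[0], lut[128], lut[191])   # dark and mid-grey values become black
print(lut[192], lut[255])           # only near-white values stay white
```

Every grey value up to and including 191 ends up as 0, so the anti-aliased edges are pulled to black rather than white.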

Putting It All Together

Once I'd got the co-ordinates for either the spiral or squaril, I drew the character on the canvas:

    draw.text(
        (x, y),
        unicode_character,
        font=font,
        fill=font_colour
    )

Except it didn't work!

Sadly, Pillow can't draw non-printable glyphs - even when the font contains something drawable. This is because it can't pass the correct options to the harfbuzz library.

So, I went oldskool! I converted every glyph in the font to a PNG and saved them to disk.

    from fontforge import *

    font = open("unifont_upper-14.0.04.ttf")
    for i in range(len(font)):
        try:
            font[i].export("pngs/" + str(i) + ".png", pixelsize=16, bitdepth=1)
        except Exception as e:
            print(str(i))
            print(e)

Look, if it's hacky but it works; it isn't hacky! Right?

From there, it's a case of opening the .png and pasting it onto the canvas:

    character_png = Image.open('pngs/' + str(character_int) + ".png")
    image.paste(character_png, (round(x), round(y)))

It was too big!

And now we hit the final problem. The image was over 20,000 pixels wide. Why? The Variation Selectors! The last of which is at position U+E01EF. Which means the spiral looks like this:

Here they are in close up:

So I decided to remove that block!
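
A rough lower bound shows how much that one block costs. The sketch assumes a gap-free square grid with one 16px cell per code point from U+0000 to U+E01EF; the real layout has spacing and empty corners, so the actual image comes out larger than this.

```python
import math

LAST_CODE_POINT = 0xE01EF   # last Variation Selector, as given in the post
GLYPH_PX = 16               # Unifont cell size

# One slot per code point, including U+0000.
slots = LAST_CODE_POINT + 1

# Smallest square grid with at least that many cells: ceil(sqrt(slots)).
cells_per_side = math.isqrt(slots - 1) + 1
min_side_px = cells_per_side * GLYPH_PX

print(slots, cells_per_side, min_side_px)
```

Even packed perfectly tight, the canvas would need more than 15,000 pixels per side just to reach that final block.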

Source Code

All the code is on GitLab. Because GitHub is so 2019…

Licensing?

The GNU Unifont has a dual licence. GPL2 and OFL. The image is a "document" for the purposes of the OFL and the GPL font exemption. But, I guess you could reverse engineer a font-file from it. So, if you use the image to generate a font, please consider that it inherits the original licence. If you just want to print it out, or use it as art, then the image itself is CC BY-SA.

This is based on my lay-person's understanding of the various copyleft licence compatibility issues. Corrections and clarifications welcome!

What's next?

I would like to print this out on paper. At 200dpi, it would be about 1.5m squared. Which I guess is possible, but might be expensive.

At 600dpi, the square will just about fit on A3 paper. But the quality is atrocious. Even at A0 it wasn't great. Realistically, it needs to be at least 3.3 metres along each side! No idea where I can find a printer which will do that. Or where in my house I'd have space for it!

Of course, it will need updating whenever there is a new release of either Unicode or Unifont.

If you have any suggestions or feedback - please drop them in the comment box!

[0] Well, look, it is complicated. Unicode is Hard™.
[1] Not to be confused with the Demonic Planes of Forbidden Unicode.

#art #unicode

**Shufei** @shufei@merveilles.town · 6d *

https://meta.wikimedia.org/w/index.php?search=Vertical+writing&title=Special%3ASearch&ns0=1&ns12=1&ns200=1&ns202=1&searchToken=vtm1d09u2iyi9e668pfvpe5u

It’s 2025 and:
- There is still no vertical text site mode for #Wikimedia in any language using vertical text.
- #Wikipedia still forces “simplified” Chinese on browsers.
- There is still no true IDS or CangJie composition matrix for characters in #Unicode.
- SignWriting still has no proper #Unicode inclusion, no IDS analogue, no inventory of signs, and is still mostly written by mouse drag and drop in a mishmash of SVG and HTML.
- There is no proper SignWriting IME, such as a Rime schema.

To say this state of affairs is cultural propaganda by mass technic inertia would be an understatement. Infotech is functional colonialism. That's really all there is to say.

Filed under #崇洋媚外

**Martin Maciaszek** @martin@maciaszek.social · Jul 19

Updated my unilookup utility. It now accepts unicode strings on stdin as well as a command line parameter. Can be installed directly from PyPi. https://github.com/fastjack/unilookup
#unicode #python

**Mela News** @MelaNews@mastodon.uno · Jul 18

To mark World Emoji Day, the Unicode Consortium has announced the new emoji arriving in Unicode 17. Among the new arrivals:
Trombone
Bigfoot
Orca
Apple will develop these emoji, available from next spring. #WorldEmojiDay #Emoji #Unicode

Replied to argv minus one

**Mark Gardner** @mjg@nerdfight.online · Jul 18

@argv_minus_one @dbarros @gruber The history is complicated. I think some late 90s Japanese #emoji had different single colors per glyph, but various carriers and hardware had different sets of glyphs and didn’t interoperate. #Unicode eventually straightened that out.

It’s much like the diversity of personal computers from the late 1970s into the 1980s, or any other nascent technology before market and other forces drive uniformity and consolidation.

**screwlisp** @screwlisp@gamerplus.org · Jul 18

Real talk.

What are some other plant/insect/bird #unicode characters? I thought of

⸙

**Paris Web** @ParisWeb@mamot.fr · Jul 16

With @MoritzBrouhaha, discover the history of the Unicode computing standard, used by everyone around the globe in our everyday communications.

https://www.paris-web.fr/2025/conference/a-la-decouverte-du-monde-au-travers-de-lunicode

#unicode #standards #typographie

**Design Brouhaha** @MoritzBrouhaha@typo.social · Jul 15 *

🜰^ᯣ⥿ᯣ^🜰

there is a #Unicode proposal to make the cat paws bigger

https://www.unicode.org/L2/L2025/25125r-alchemical-glyphs.pdf

#kaomoji #kaomojicoolclub

**Paul Melis** @paulmelis@social.edu.nl · Jul 11

The recycling symbol in a git branch name, what a time to be alive

Also, nice of #github to warn about possibly hidden characters, but not sure it applies in this case

https://github.com/JuliaLang/julia/pull/58418

#unicode

Replied in thread

**Dr. bar. met. Paul B.** @joschtl@karlsruhe-social.de · Jul 10

@lritter
🯁🯂🯃 That's why I #Unicode

**Adële** @adele@social.pollux.casa · Jul 7

Unicode characters for Creative Commons symbols

I've just discovered that there are symbols since Unicode 13.0 for CC licences

CC: 🅭
BY: 🅯
NC: 🄏
ND: ⊜
SA: 🄎
PD: 🅮
CC0: 🄍

Source: Creative Commons license - Wikipedia (en.wikipedia.org)

#creativecommons #emoji #unicode

**Club de TéléMatique** @ClubTeleMatique@mstdn.social · Jul 6

Understand UTF-8 and the others (16, 32) https://tonsky.me/blog/unicode/ #computer #reference #unicode

tonsky.me · Oct 2, 2023 · The Absolute Minimum Every Software Developer Must Know About Unicode in 2023 (Still No Excuses!) · Modern extension to classic 2003 article by Joel Spolsky

**Abimelech B. | wörk** @abimelechbeutelbilch@fulda.social · Jul 6

It's 2025 and everyone everywhere knows and uses #unicode and #utf - except this stubbornly resistant cinema called #filmpalast in #karlsruhe

**Veronica Olsen** @veronica@mastodon.online · Jul 5 *

Got a bug report for @novelwriter from someone who uses Cuneiform text in their work. These are 4 byte Unicode symbols, and turned out to be very tricky to handle.

The app is built with Python, which will switch a string to UCS-4 when it contains such characters, so the characters always have a single index in the string.

However, the Qt library uses UTF-16. That means 4-byte characters use two slots, creating a mismatch in indices between the two representations.

#Python #Qt #Code
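
The index mismatch can be sketched in plain Python. `utf16_offset` is a hypothetical helper written for this illustration, not something taken from novelWriter or Qt. It converts a Python string index (counted in code points) into the offset a UTF-16 based library would use (counted in 16-bit code units).

```python
def utf16_offset(text: str, index: int) -> int:
    """Convert a Python str index into a UTF-16 code unit offset."""
    # Encode only the part before the index; each UTF-16 code unit is 2 bytes,
    # and characters outside the BMP take two code units (a surrogate pair).
    return len(text[:index].encode("utf-16-le")) // 2

sample = "a\U00012030b"   # "a", one Cuneiform sign (U+12030), "b"

print(len(sample))               # length in code points
print(utf16_offset(sample, 1))   # offset of the Cuneiform sign
print(utf16_offset(sample, 2))   # offset of "b", shifted by the surrogate pair
```

Here "b" sits at index 2 in Python but at offset 3 in UTF-16, which is the kind of drift that throws off cursor and highlight positions.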

**Habr** @habr@zhub.link · Jul 1

[Translation] A guide to effective localization in Unreal Engine

Localization is one of the key, yet often underestimated, aspects of game development. As the global audience grows, players expect to see games in their native language, and localization becomes a necessity rather than a luxury. However, localization is not just translating text. It involves solving technical problems, accounting for cultural differences and optimizing the workflow to provide a smooth and comfortable gaming experience in several languages. In this article I will talk about the difficulties of localization in Unreal Engine, drawing on my experience working on Wizard of Legend 2. We will cover collecting and managing text, as well as problems with formatting, gender-dependent language and font handling. I will also talk about the key aspects that can cause delays and how to minimize them.

https://habr.com/ru/companies/otus/articles/923968/

#геймдев #unreal_engine #Гендерная_локализация

**Flominator** @Flominator@genealysis.social · Jul 1 *

Fascinating: Two feeds for @hinterzarten_news could no longer be pasted properly from the website, because they changed the dates from having U+200A ("Hair Space") between the dots and the numbers to U+200B ("Zero Width Space"). Shout-out to the creator of https://www.mauvecloud.net/charsets/CharCodeFinder.html, which is a really helpful tool for finding out exactly what #character you have in front of you.

#webdev #unicode #html
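
For a quick local check of the same kind, Python's standard `unicodedata` module can name each character in a string. This is a sketch only: `describe` is a hypothetical helper and the sample date is made up.

```python
import unicodedata

def describe(text):
    """List the code point and Unicode name of every character in text."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<no name>')}"
        for ch in text
    ]

# A made-up date using a hair space and a zero width space after the dots.
for line in describe("1.\u200a7.\u200b2025"):
    print(line)
```

Invisible characters that look identical on screen show up with distinct names in the output.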

**Habr** @habr@zhub.link · Jul 1

The main question about Cyrillic email

Email with an address like info@пример.бел is technically possible, and we at HB.BY support it. But there is almost no demand. In this article we look at who dreamed of Cyrillic email and what puts people off it, to find out where it will all lead.

https://habr.com/ru/articles/922268/

Habr preview: "The only thing worse than Cyrillic domains is Cyrillic email." That is roughly what commenters wrote on our previous article. An e-mail address like иван@пример.рф does indeed look unusual. To understand,...

#unicode #eai