#kqueue

Replied in thread

@jan Funnily, the lack of #POSIX #timers in #OpenBSD inspired me to come up with a timer implementation supporting different backends. I was annoyed at first, but didn't regret it. The interface offered by POSIX timers is really clumsy: they can either send some signal or (even worse) launch a thread. My current code only uses them on #illumos, which offers a better signaling mechanism on an "event port". Where available, #kqueue is used for timers. The next fallback is #Linux' #timerfd. And finally, as a last resort, some manual multiplexing on top of #setitimer (because it's not much worse than "vanilla" POSIX timers).
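To illustrate the #timerfd fallback named above, here is a minimal Linux-only sketch of a one-shot timer that expires as a readable file descriptor. The function names `oneshot_timer_fd` and `wait_for_expiry` are made up for this example and are not taken from poser; a real event loop would register the descriptor with epoll instead of blocking in read().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/* Arm a one-shot timer (Linux only) and return its file descriptor,
 * or -1 on error. millis must be > 0: an all-zero it_value would
 * disarm the timer instead of arming it. */
int oneshot_timer_fd(long millis)
{
    assert(millis > 0);
    int fd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC);
    if (fd < 0) return -1;

    struct itimerspec spec = {0};   /* it_interval stays 0: no repetition */
    spec.it_value.tv_sec = millis / 1000;
    spec.it_value.tv_nsec = (millis % 1000) * 1000000L;
    if (timerfd_settime(fd, 0, &spec, NULL) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Block until the timer has expired. Returns the number of expirations
 * reported by the kernel, or 0 if the read failed. */
uint64_t wait_for_expiry(int fd)
{
    uint64_t expirations = 0;
    if (read(fd, &expirations, sizeof expirations)
            != (ssize_t)sizeof expirations) return 0;
    return expirations;
}
```

Because the expiry arrives as ordinary readability, no signal and no extra thread is involved, which is exactly the clumsiness of POSIX timers the post complains about.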

tl;dr, might be an alternative to contribute code to upstream enabling them to use better platform interfaces for timers...

Please help me spread the link to #swad 😎

github.com/Zirias/swad

I really need some users by now, for those two reasons:

* I'm at a point where I fully covered my own needs (the reasons I started coding this), and getting some users is the only way to learn about what other people might need
* The complexity "exploded" after supporting so many OS-specific APIs (like #kqueue, #epoll, #eventfd, #signalfd, #timerfd, #eventports) and several #lockfree implementations based on #atomics while still providing fallbacks for everything that *should* work on any #POSIX systems ... I'm definitely unable at this point to think of every possible edge case and test it. If there are #bugs left (which is somewhat likely), I really need people reporting these to me

Thanks! 🙃

I now implemented a per-thread #pool to reuse #timer objects in #poser (my lib I use for #swad).

The great news is: This improved performance, which is an unintended side effect (my goal was to reduce RAM usage 🙈😆). I tested with the #kqueue backend on #FreeBSD and sure, this makes sense: So far, I needed to keep a list of destroyed timers that's always checked to solve an interesting issue: By the time I cancel a timer with #kevent, the expiry event might already be queued, but not yet read by my event loop. Trying to fire events from a timer that doesn't exist any more would segfault of course. That's not necessary any more with the pool approach: the timer WILL exist and I can just check whether it's "alive".
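The "alive" check can be sketched roughly like this. It is a simplified illustration, not poser's actual code: the fixed-size `TimerPool`, the `PoolTimer` type and all function names are assumptions made for the example. The pool is meant to be used by a single thread, so it needs no locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 64

typedef struct PoolTimer {
    bool alive;                  /* false while the slot is on the free list */
    unsigned fired;              /* number of expiries delivered so far */
    struct PoolTimer *next_free; /* free-list link */
} PoolTimer;

typedef struct TimerPool {
    PoolTimer slots[POOL_SIZE];  /* storage is never freed, only reused */
    PoolTimer *free_list;
} TimerPool;

void pool_init(TimerPool *pool)
{
    pool->free_list = NULL;
    for (size_t i = POOL_SIZE; i > 0; --i) {
        PoolTimer *t = &pool->slots[i - 1];
        t->alive = false;
        t->fired = 0;
        t->next_free = pool->free_list;
        pool->free_list = t;
    }
}

/* Take a timer object from the pool; NULL if the pool is exhausted. */
PoolTimer *pool_acquire(TimerPool *pool)
{
    PoolTimer *t = pool->free_list;
    if (!t) return NULL;
    pool->free_list = t->next_free;
    t->alive = true;
    t->fired = 0;
    return t;
}

/* "Destroy" a timer: the memory stays valid, it is only marked dead. */
void pool_release(TimerPool *pool, PoolTimer *t)
{
    assert(t->alive);            /* guard against releasing twice */
    t->alive = false;
    t->next_free = pool->free_list;
    pool->free_list = t;
}

/* Called by the event loop for a queued expiry event. A stale event for
 * a cancelled timer is simply dropped instead of touching freed memory. */
bool pool_deliver_expiry(PoolTimer *t)
{
    if (!t->alive) return false;
    ++t->fired;
    return true;
}
```

One limitation of this sketch: if a slot is re-acquired before the stale event is read, that event would reach the new timer. A generation counter stored in both the timer and the event would be one way to tell them apart.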

The result? Same hardware as always, and now swad reaches a throughput of 26000 requests per second with (almost) perfect response times. 🥳

I'm still not happy with memory usage. It's better, but I have no explanation for what I observed now:

Ran the same test 3 times, 1000 #jmeter threads each simulating a distinct client running a loop for 2000 times doing one GET and one POST for a total of 4 million requests. After the first time, the resident set was at 178MiB. After the second time, 245 MiB. And after the third time, well, 245 MiB. How ...? 🤯

Also, there's another weird observation I have no explanation for. My main thread delegates accepted connections to worker threads simply "round robin". And each time I run the jmeter test, all these worker threads show increasing CPU usage at a similar rate, until suddenly, one single thread seems to do "more work", which stabilizes when this thread is utilizing almost double the CPU as all other worker threads. And when I run the jmeter test again (NOT restarting swad), the same happens again, but this time, it's a *different* thread that "works" a lot more than all others.

I wonder whether I should accept that scheduling, memory management etc. are all "black magic" and swad is probably "good enough" as is right now. 😆

Getting somewhat closer to releasing a new version of #swad. I now improved the functionality to execute something on a different worker thread: Use an in-memory queue, providing a #lockfree version. This gives me a consistent reliable throughput of 3000 requests/s (with outliers up to 4500 r/s) at an average response time of 350 - 400 ms (with TLS enabled). For waking up worker threads, I implemented different backends as well: kqueue, eventfd and event-ports, the fallback is still a self-pipe.
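For readers who don't know the self-pipe fallback mentioned above, a minimal sketch using only POSIX calls could look like this. The `Waker` type and its functions are names invented for the example; how poser actually wires this into its event loop is not shown here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

typedef struct Waker { int rfd; int wfd; } Waker;

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Create the pipe; both ends are non-blocking. Returns 0 or -1. */
int waker_init(Waker *w)
{
    int fds[2];
    assert(w != NULL);
    if (pipe(fds) < 0) return -1;
    if (set_nonblocking(fds[0]) < 0 || set_nonblocking(fds[1]) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    w->rfd = fds[0];
    w->wfd = fds[1];
    return 0;
}

/* Called from another thread: write one byte to wake the waiter.
 * A full pipe (EAGAIN) is fine, a wake-up is already queued then. */
int waker_wake(Waker *w)
{
    char byte = 1;
    ssize_t rc;
    do {
        rc = write(w->wfd, &byte, 1);
    } while (rc < 0 && errno == EINTR);
    return (rc < 0 && errno != EAGAIN) ? -1 : 0;
}

/* Called by the worker: wait up to timeout_ms for a wake-up.
 * Returns 1 if woken, 0 on timeout, -1 on error. All queued bytes are
 * drained, so several wake calls collapse into a single wake-up. */
int waker_wait(Waker *w, int timeout_ms)
{
    struct pollfd pfd = { .fd = w->rfd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0) return rc;
    char buf[64];
    while (read(w->rfd, buf, sizeof buf) > 0) { /* drain */ }
    return 1;
}

void waker_destroy(Waker *w)
{
    close(w->rfd);
    close(w->wfd);
}
```

In a real worker the read end would sit in the same poll/select set as the sockets. The eventfd, kqueue and event-port backends replace the two pipe descriptors with a single kernel object.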

So, #portability here really means implement lots of different flavors of the same thing.

Looking at these startup logs, you can see that #kqueue (#FreeBSD and other BSDs) is really a "jack of all trades", being used for "everything" if available (and that's pretty awesome, it means one single #syscall per event loop iteration in the generic case). #illumos' (#Solaris) #eventports come somewhat close (but need a lot more syscalls as there's no "batch registering" and certain event types need to be re-registered every time they fired), they just can't do signals, but illumos offers Linux-compatible signalfd. Looking at #Linux, there's a "special case fd" for everything. 🙈 Plus #epoll also needs one syscall for each event to be registered. The "generic #POSIX" case without any of these interfaces is just added for completeness 😆

Replied in thread

@meka

One thing not strictly select()-related:

EVFILT_PROC with NOTE_TRACK is another gotcha. It looks great at first glance.

Until one really hammers it with lots of short-lived processes in a deep tree.

Then it becomes apparent that NOTE_EXIT and NOTE_CHILD both use the data field for different purposes, but get merged, and grandchildren get lost from their parents.

news.ycombinator.com/item?id=2

Replied in thread

@meka

I'm currently working with an old kernel, and I don't know off the top of my head if/when this got fixed. But one of the things is that attempting to add an EVFILT_READ for a file descriptor that is /dev/null results/resulted in an ENODEV EV_ERROR event.

It's generally edge case devices that didn't get attention because a lot of the relevant software still used select() and so no-one (apart from kevent()-everywhere people like me) really hit this.

Replied in thread

@meka

It is always welcome to see more kevent(), if only because it lets other people share my pain, in the hope that that increases the push for kevent() to be fully completed and as good as select().

There are a number of cases I have hit over the years kevent() cannot *quite* do what select() does.

Replied in thread

@jhx Regarding that, at least in theory, it's indeed "truly portable" as it works fine using only #POSIX compliant APIs.

In practice, there can be issues with platforms that don't implement the *full* POSIX feature-set (which is in fact most platforms nowadays). There can also be nasty issues with how feature-test macros are handled (set by the compiler, interpreted by the system's headers) and sometimes with which libraries are needed (unfortunately, POSIX doesn't specify that, e.g. on illumos, you have to link a libsocket for any socket functionality 🙄).

Once I started to add optional support for the platform-specific mechanisms #epoll on #Linux and #kqueue on #BSD (because the POSIX standard select and poll have severe scalability issues), I wanted to also add support for /dev/poll as used on Solaris. That's why I installed #OpenIndiana (illumos-based) in a VM to do tests, and I quickly learned /dev/poll was superseded by "event ports", so that's what I added instead.

Continued thread

Unfortunately, I had to do a bugfix release: #swad 0.8

Although I didn't observe any obvious misbehavior on my own installation for several days, I discovered two very relevant bugs just after release of 0.7 🤦‍♂️ -- one of them (only affecting #kqueue, for example on #FreeBSD) even critical because it could trigger "undefined behavior".

Both bugs are regressions from new (performance) improvements added, one from trying to queue as many writes as possible when sending HTTP responses, one from using kqueue to provide timers and signals.

See release notes for 0.8. Don't use 0.7. Sorry 🤪

github.com/Zirias/swad

Now that #swad 0.7 is released, it's time to prepare a new release of #poser, my own lib supporting #services on #POSIX systems, following a #reactor with #threadpool design.

During development of swad, I moved poser from using strictly only POSIX APIs (with the scalability limits of e.g. #select) to auto-detected support for #kqueue, #epoll, #eventports, #signalfd and #timerfd (so now it could, in theory(!), "compete" with e.g. libevent). I also fixed quite a few hidden bugs, and added more base functionality, like a #dictionary using nested hashtables internally, or #async tasks mimicking the async/await pattern known from e.g. #csharp. I also deprecated two features, the periodic and global "service tick" (superseded by individual timers) and the "resolve hosts" property of a "connection" (superseded by a separate resolve class).

I'll have to decide on a few things, e.g. whether I'll remove the deprecated stuff immediately and bump the major version of the "posercore" lib. I guess I'll do just that. I'd also like to add all the web-specific stuff (http 1.0/1.1 server) that's currently part of the swad code as a "poserweb" lib. This would get a major version of 0, indicating a generally unstable API/ABI as of now....

And then, I'd have to decide where certain utility classes belong. The rate limiter is probably useful for things other than web, so it should probably go to core. What about url encoding/decoding, for example? 🤔

Stay tuned, something will come here, maybe helping you to write a nice service in plain #C 😎:

github.com/Zirias/poser

Replied in thread

@Toasterson @astade I'm downloading an #OpenIndiana image right now, we will see.

For Linux, I now have support for #epoll (fd events), and additionally #signalfd and #timerfd. On the BSDs that have #kqueue, it covers all these aspects. So, little question here: Are there any Solaris / #illumos APIs specifically for signal handling and individual timers I should look into? Or should I stick to classic async handlers (signals) and setitimer (timers) on that platform, and just have a look at /dev/poll?

Replied in thread

@Toasterson @astade Yeah, my main use case is socket notification, the lib offers #select and #poll as a base working across #POSIX, and I added #epoll for Linux and #kqueue for FreeBSD ... and as kqueue can do a lot more, I recently also implemented using it for signal handling.

I normally prefer my own local infrastructure, already fixed my Linux VM yesterday to test thoroughly with epoll (and, MAYBE, add signalfd support). So I might just install e.g. #OpenIndiana to also play around with /dev/poll. Had a look at an old SunOS manpage, seems simple enough to use 😉

The next release of #swad will probably bring not a single new feature, but focus on improvements, especially regarding #performance. Support for using #kqueue (#FreeBSD et al) to handle #signals is a part of it (which is done and works). Still unsure whether I'll also add support for #Linux' #signalfd. Using kqueue also as a better backend for #timers is on the list.

Another hopefully quite relevant change is here:

github.com/Zirias/poser/commit

In short, so far my #poser lib was always awaiting readiness notification (from kqueue, or #epoll on Linux, or select/poll for other platforms) before doing any read or write on a socket. This is the ideal approach for reads, because in the common case, a socket is NOT ready for reading ... our kernel must have received something from the remote end first. But for writes, it's not so ideal. The common case is that a socket IS ready to write (because there's space left in the kernel's send buffers). So, just try it, and only register for notifications if it ever fails, makes more sense. Avoids pointless waiting and pointless events, and e.g. with epoll, even unnecessary syscalls. 😉
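The "just try it" idea for writes can be sketched as follows. This is a hedged illustration, not the code from the linked commit: `try_send`, `SendStatus` and `make_nonblocking` are hypothetical names, and the caller is assumed to register for write-readiness (kqueue, epoll, poll ...) only when `SEND_WOULD_BLOCK` comes back.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

typedef enum { SEND_DONE, SEND_WOULD_BLOCK, SEND_ERROR } SendStatus;

/* Put a descriptor into non-blocking mode. Returns 0 or -1. */
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Optimistically write without waiting for a readiness event first.
 * fd must be non-blocking. *sent is the number of bytes already written
 * and is advanced on progress, so the call can be repeated later.
 *   SEND_DONE        everything was written, no event registration needed
 *   SEND_WOULD_BLOCK kernel buffer is full: only NOW register for
 *                    write-readiness and retry when the event arrives
 *   SEND_ERROR       write failed, errno holds the reason */
SendStatus try_send(int fd, const char *buf, size_t len, size_t *sent)
{
    assert(*sent <= len);
    while (*sent < len) {
        ssize_t rc = write(fd, buf + *sent, len - *sent);
        if (rc > 0) {
            *sent += (size_t)rc;
            continue;
        }
        if (rc < 0 && errno == EINTR) continue;
        if (rc < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return SEND_WOULD_BLOCK;
        return SEND_ERROR;
    }
    return SEND_DONE;
}
```

In the common case the first call returns `SEND_DONE`, so the event backend is never touched for that write at all.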

Continued thread

I think I'll "go #kqueue all the way" now, by also using it for providing my #timers if available. The current implementation multiplexes them on top of #setitimer, that's the portable variant that will stay as a fallback, but with kqueue, there's no need to have all these #SIGALRM generated, and I can avoid the slight imprecision by multiplexing in userspace 😎

Continued thread

Ok, I forgot about restoring the previous state, because I already broke this when implementing the generic signal handling via flags. And restoring everything to default must be good enough. Then, using #kqueue was kind of easy. Just this one weirdness that I'm not allowed to ignore #SIGCHLD, so I have to block this instead ... does anyone have any idea why?

Still thinking whether I should also add support for #signalfd ... unfortunately different semantics, for that I should (according to its manpage) *block* signals, not ignore them. But maybe I should do some tests there as well.

Hmm. Now that I have a working "generic" #signal handling in #poser, I'd like to optimize a bit by picking up @david_chisnall's suggestion to use #kqueue for the job if available. Would have a clear advantage: No need to fiddle with the signal mask around every call to #kevent.

I still had the doubt whether a signal delivered via kqueue would still remain pending when it is just blocked, so I wrote some little test code and the unfortunate answer is: yes. Unfortunate because I want my library code to restore everything as it was found (signal mask and handlers) on exit, but I certainly don't want a batch of spurious signals handled when unblocking them.

Kind of obvious solution: Set the signals temporarily to ignored when unblocking them, as shown in the screenshot. Now I have the next doubt: Is it guaranteed to have pending signals delivered instantly when unblocking them? 🤔

Still working on #swad, and currently very busy with improving quality, most of the actual work done inside my #poser library.

After finally supporting #kqueue and #epoll, I now integrated #xxhash to completely replace my previous stupid and naive hashing. I also added a more involved #dictionary class as an alternative to the already existing #hashtable. While the hashtable's size must be pre-configured and collisions are only ever resolved by storing linked lists, the new dictionary dynamically nests multiple hashtables (using different bits of a single hash value). I hope to achieve acceptable scaling while maintaining also acceptable memory overhead that way ...
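How nesting hashtables on different bits of one hash value can work is sketched below. This is only an illustration of the general idea and certainly differs from poser's real dictionary: the 4-bit levels, all type and function names, and the use of FNV-1a as a stand-in for xxhash are assumptions of this example. Keys are not copied.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BITS 4                    /* hash bits consumed per nesting level */
#define SLOTS (1u << BITS)
#define MAX_DEPTH (64 / BITS)     /* levels until the hash is used up */

typedef struct Entry {
    const char *key;
    int value;
    uint64_t hash;
    struct Entry *next;           /* chaining, only at the deepest level */
} Entry;

typedef struct Table Table;
typedef struct Slot { Entry *entries; Table *child; } Slot;
struct Table { Slot slots[SLOTS]; };

static uint64_t hash_key(const char *key)   /* FNV-1a, NOT xxhash */
{
    uint64_t h = 14695981039346656037ULL;
    for (; *key; ++key) { h ^= (unsigned char)*key; h *= 1099511628211ULL; }
    return h;
}

/* Walk down the nested tables, using the next BITS of the hash per level. */
static Slot *find_slot(Table *t, uint64_t hash, unsigned *depth)
{
    unsigned d = 0;
    Slot *s = &t->slots[hash & (SLOTS - 1)];
    while (s->child) {
        ++d;
        s = &s->child->slots[(hash >> (BITS * d)) & (SLOTS - 1)];
    }
    *depth = d;
    return s;
}

int dict_set(Table *root, const char *key, int value)
{
    uint64_t hash = hash_key(key);
    for (;;) {
        unsigned depth;
        Slot *s = find_slot(root, hash, &depth);
        for (Entry *e = s->entries; e; e = e->next)
            if (e->hash == hash && strcmp(e->key, key) == 0) {
                e->value = value;
                return 0;
            }
        if (s->entries && depth + 1 < MAX_DEPTH) {
            /* Collision: nest a new table and push the old entry down. */
            Table *child = calloc(1, sizeof *child);
            if (!child) return -1;
            Entry *old = s->entries;
            assert(old->next == NULL);   /* single entry above deepest level */
            child->slots[(old->hash >> (BITS * (depth + 1))) & (SLOTS - 1)]
                .entries = old;
            s->entries = NULL;
            s->child = child;
            continue;                    /* retry one level deeper */
        }
        Entry *e = malloc(sizeof *e);
        if (!e) return -1;
        e->key = key; e->value = value; e->hash = hash;
        e->next = s->entries;            /* non-NULL only at deepest level */
        s->entries = e;
        return 0;
    }
}

int dict_get(Table *root, const char *key, int *value)
{
    uint64_t hash = hash_key(key);
    unsigned depth;
    Slot *s = find_slot(root, hash, &depth);
    for (Entry *e = s->entries; e; e = e->next)
        if (e->hash == hash && strcmp(e->key, key) == 0) {
            *value = e->value;
            return 1;
        }
    return 0;
}

void dict_clear(Table *t)
{
    for (unsigned i = 0; i < SLOTS; ++i) {
        Slot *s = &t->slots[i];
        if (s->child) { dict_clear(s->child); free(s->child); s->child = NULL; }
        while (s->entries) { Entry *e = s->entries; s->entries = e->next; free(e); }
    }
}
```

Tables only come into existence where collisions actually happen, which is where the hoped-for balance between scaling and memory overhead comes from. Linked lists remain only as a last resort once all hash bits are used up.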

#swad already uses both container classes as appropriate.

Next I'll probably revisit poser's #threadpool. I think I could replace #pthread condition variables by "simple" #semaphores, which should also reduce overhead ...

github.com/Zirias/swad

Replied in thread

@feld @pertho Doesn't feel like something I'd like to try though. It would be pretty much #FreeBSD-specific, but worse than that, you shouldn't rely on dtrace availability, as it's an optionally loadable profiling tool (so, it would also be a "misuse" regarding purpose). Related to that, I'm pretty sure it requires superuser privileges for everything, which would be another issue for some general-purpose application software.

No, the "canonical" solution for filesystem watching on a BSD system is indeed #kqueue. And unfortunately, it does fall short a bit compared to #Linux' #inotify. For my #xmoji tool, I wanted notifications about any change on the runtime configuration file, and in addition to the pure #POSIX solution of periodically calling stat() (which is stupid, but still works for a single file), I implemented backends for both inotify and kqueue. For just a single file, kqueue's requirement of having an open file descriptor is just a minor annoyance, but you can deal with that. Note it's not as simple as it sounds in any case, e.g. when the file is deleted, you want to watch the directory of course, so you learn when it's re-created ... which with kqueue requires opendir() 🙈 ... still doable. But for scenarios where you want to watch a whole tree with potentially lots of files and directories, this is really bad and #inotify really shines.

@pertho The only little drawback compared to epoll is the lack of atomic signal mask setting, so you need a bit more code and a thoughtful structure to handle signals in the same loop. Apart from that, indeed much better than #epoll.

Unfortunately, it's not better than inotify (for a completely different purpose, #file #monitoring ... kqueue covers them all). With #inotify, you can for example set a #watch by path, while #kqueue requires opened file descriptors. 😞