Righteous Wrath Online Community

General => Tech Chat => Topic started by: Thorin on September 14, 2010, 11:33:32 AM

Title: Threading is Hurting My Brain
Post by: Thorin on September 14, 2010, 11:33:32 AM
So I'm working on an application that reads from multiple serial ports.  The easiest way to do that, of course, is to call a new worker thread to handle the Data Received event each time data comes in on a serial port.  Unfortunately, the data needs to be recorded into a log, and if it's unique it needs to be recorded into a database.

Well, I've heard (and read) that Threading Is Difficult™.  So I thought I'd go through a tutorial or two to refresh my memory on how threading works in .NET, and more specifically what the gotchas are.  I managed to find this very thorough discussion on threading in .NET, complete with C# examples (even .NET 4.0):

http://www.albahari.com/threading/

However, my brain now hurts.
Title: Re: Threading is Hurting My Brain
Post by: Lazybones on September 14, 2010, 01:59:35 PM
Hello blocking and locking.... However the performance gains on modern hardware are amazing...

Didn't get too far with it when I was still developing, but I did make one or two multi-threaded applications... Debugging has to be the hardest part.
Title: Re: Threading is Hurting My Brain
Post by: Tom on September 15, 2010, 12:28:21 AM
One alternative is using an "IO Multiplexer", which is just fancy words for "select". I'm not sure how .NET or Windows handles that though. If it's possible to ask the OS to wait on several devices at once, that's an easy way out.

My game server queues up all its sockets, and selects on them at the same time.

But yes, threading can be quite hard. Especially if you need high performance and scalability. If you don't, just use a nice global mutex around the common data structures and call it a day.
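To make the "global mutex" idea concrete, here is a minimal sketch in Python (the thread is about .NET, so the language is a substitution). The `SharedStore` class and `run_demo` helper are hypothetical names made up for illustration. One lock guards both the log and the set of unique messages, so only one thread at a time touches either.

```python
import threading

class SharedStore:
    """Hypothetical shared state: a log plus a set of unique messages."""

    def __init__(self):
        self._lock = threading.Lock()   # the single "global" mutex
        self.log_lines = []
        self.unique = set()

    def record(self, source, message):
        """Log every message; return True only the first time it is seen."""
        with self._lock:                # one thread at a time past this point
            self.log_lines.append((source, message))
            if message in self.unique:
                return False
            self.unique.add(message)
            return True

def run_demo(thread_count=4, per_thread=1000):
    store = SharedStore()
    new_counts = [0] * thread_count     # each slot is written by one thread only

    def worker(idx):
        for n in range(per_thread):
            if store.record(f"port{idx}", f"msg-{n}"):
                new_counts[idx] += 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(store.log_lines), len(store.unique), sum(new_counts)
```

The trade-off is the one described above: every caller queues behind the same lock, which is simple to reason about but does nothing for scalability.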
Title: Re: Threading is Hurting My Brain
Post by: Thorin on September 15, 2010, 09:33:37 AM
In the case of this particular application, I'm not multi-threading for performance gains, I'm doing it to be able to respond to multiple hardware signals at the same time.

In the end, I'm taking the easy way out: I catch a hardware signal on a different thread, then marshal it back to the main thread where all the processing happens.  This ensures sequential updates of member variables without having to think about locking.  Yeah, it means no performance gains, but really, this app doesn't do much besides catch hardware signals and report them in a UI.

I wouldn't mind some day working on an app that requires threading for performance, that is properly planned out, and that wasn't due before they asked me to work on it. <sigh>
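For anyone following along, the "catch it on another thread, marshal it back to the main thread" approach can be sketched roughly like this in Python. This is not the .NET code from the app; in WinForms the hand-off would go through Control.Invoke, and here a `queue.Queue` stands in for it. The `reader` and `run_main_loop` names, and the fake port data, are assumptions for the demo.

```python
import queue
import threading

def reader(port_name, samples, inbox):
    """Stand-in for a per-port receive callback running on its own thread."""
    for sample in samples:
        inbox.put((port_name, sample))   # hand off; never touch shared state here
    inbox.put((port_name, None))         # sentinel: this port has finished

def run_main_loop(ports):
    """Process every event on the calling ("main") thread, one at a time."""
    inbox = queue.Queue()
    threads = [threading.Thread(target=reader, args=(name, data, inbox))
               for name, data in ports.items()]
    for t in threads:
        t.start()

    totals = {name: 0 for name in ports}  # plain dict; only this thread uses it
    open_ports = len(ports)
    while open_ports:
        name, sample = inbox.get()        # blocks until some reader posts an item
        if sample is None:
            open_ports -= 1
        else:
            totals[name] += sample
    for t in threads:
        t.join()
    return totals
```

Because only one thread ever updates `totals`, no lock is needed around it. A real UI thread would not block on `inbox.get()`; it would drain the queue from a timer or let the framework deliver the call.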
Title: Re: Threading is Hurting My Brain
Post by: Tom on September 15, 2010, 11:40:40 AM
Quote from: Thorin on September 15, 2010, 09:33:37 AM
In the case of this particular application, I'm not multi-threading for performance gains, I'm doing it to be able to respond to multiple hardware signals at the same time.

In the end, I'm taking the easy way out: I catch a hardware signal on a different thread, then marshal it back to the main thread where all the processing happens.  This ensures sequential updates of member variables without having to think about locking.  Yeah, it means no performance gains, but really, this app doesn't do much besides catch hardware signals and report them in a UI.

I wouldn't mind some day working on an app that requires threading for performance, that is properly planned out, and that wasn't due before they asked me to work on it. <sigh>
My first suggestion is pretty much identical, except you don't have to create any threads. Just sit and "wait" on all the devices, and when one or more have data, you'll be woken up to handle it all.
Title: Re: Threading is Hurting My Brain
Post by: Thorin on September 15, 2010, 11:59:49 AM
If I stop my main thread to wait for hardware signals, a la your suggested select(), my UI stops responding.  For those in the .NET world, select() is essentially the same as Thread.WaitOne().

So given that select() blocks the thread it's called on for a given amount of time, how do you select() on more than one socket at the same time from only one thread?

I'm willing to bet that you're actually creating a new thread for the socket, then calling select() on that thread while waiting for it to answer.  Especially as this is the standard way of handling multiple sockets in an application.
Title: Re: Threading is Hurting My Brain
Post by: Tom on September 15, 2010, 12:12:22 PM
Nah. If .NET's GUI stuff is half as good as Qt, you can register as many "Devices" with the main loop as you like and it'll do the waiting for you. At least my game server works that way. I just get signals when there's a new connection or there's data available.
Title: Re: Threading is Hurting My Brain
Post by: Tom on September 15, 2010, 12:21:43 PM
Also, select really isn't like a Thread.waitOne. select can wait on multiple handles, up to 1024 or so on Linux, and if the timeout is 0, it just returns immediately rather than sleeping. I can't tell if Thread.waitOne is just WaitHandle.waitOne...

So yeah, you can use select in non-blocking mode, or fire off a /single/ thread to notify the main thread that things are happening (not one per device).
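A rough sketch of the non-blocking variant, in Python rather than C: `select.select` with a timeout of 0 checks several handles in one call and returns straight away. The three socket pairs are stand-ins for devices (an assumption for the demo; on Windows, Python's `select` only accepts sockets), and `poll_ready` and `demo` are hypothetical helper names.

```python
import select
import socket

def poll_ready(sockets):
    """Return the sockets that have data right now, without blocking."""
    readable, _, _ = select.select(sockets, [], [], 0)   # timeout 0 means poll
    return readable

def demo():
    # Three socket pairs stand in for three devices watched by one thread.
    pairs = [socket.socketpair() for _ in range(3)]
    device_ends = [a for a, _ in pairs]
    try:
        before = poll_ready(device_ends)        # nothing has been sent yet
        pairs[1][1].sendall(b"hello")           # "device" 1 produces data
        after = poll_ready(device_ends)
        data = after[0].recv(16) if after else b""
        return len(before), [device_ends.index(s) for s in after], data
    finally:
        for a, b in pairs:
            a.close()
            b.close()
```

A UI loop could call something like `poll_ready` from a timer tick, so a single thread services every device without ever sleeping inside select.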
Title: Re: Threading is Hurting My Brain
Post by: Lazybones on September 15, 2010, 12:43:07 PM
Rule 1 of threading / ui is that you do your "work" in a different thread than the ui.
Two threads main/ui and a single work thread would be easier to sync than a main/ui plus a thread for every interface.
Title: Re: Threading is Hurting My Brain
Post by: Thorin on September 15, 2010, 02:24:47 PM
I'm breaking Rule 1.  Although none of the work takes more than a few milliseconds in my app.  The only reason there's multiple threads is because whenever an event happens, the event handler is started on its own thread.

From what I saw of select(), if you set the timeout to 0 then when it returns it is no longer waiting for a signal from the hardware.  It might be that what happens is a new thread is spawned that sits and waits for response from the device.
Title: Re: Threading is Hurting My Brain
Post by: Mr. Analog on September 15, 2010, 07:19:11 PM
You're threading hardware calls? Madness!

There are a few things you NEVER want to do in a multi-threaded environment:
1. Database calls
2. File IO
3. Anything that should be transactional

The app I work on is a total mishmash of someone not understanding how threading works in the first place and then trying to solve their problems by nesting thread pools inside thread pools. And the client wants us to "fix" their little transaction "problem".

Before I got my hands on the code there was not a SINGLE LINE in over 500,000 lines of code that EVER prevented thread locking, ANYWHERE.

Anyway, it's a sore spot and I learnt a lot about how to totally abuse multi-threading beyond all belief.

...happy thoughts...
Title: Re: Threading is Hurting My Brain
Post by: Tom on September 16, 2010, 12:06:09 AM
Quote from: Thorin on September 15, 2010, 02:24:47 PM
I'm breaking Rule 1.  Although none of the work takes more than a few milliseconds in my app.  The only reason there's multiple threads is because whenever an event happens, the event handler is started on its own thread.

From what I saw of select(), if you set the timeout to 0 then when it returns it is no longer waiting for a signal from the hardware.  It might be that what happens is a new thread is spawned that sits and waits for response from the device.
The way it works on Unix is that it just checks the status flags of the file descriptor (everything is a file of some sort on Unix!), and then returns the fd_sets through the three arguments: one for fds that are ready for reading, one for fds that are ready for writing, and one for fds that have an error condition (i.e. it was closed on the other end). With a non-zero timeout specified, the process will tell the OS to wake it up when any events come in for the given fds or the timeout expires, then go to sleep. No threading involved. At least not in the way I think you're thinking.

Generally, devices have hardware interrupts. When data comes in, the device fires an interrupt to the operating system, causing the driver to wake up and handle the data, which gets fed to the stack above. Should any processes be listening/selecting on an fd that's attached to said device, they will be notified, and woken up if need be.
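The three result sets and the timeout behaviour can be tried out with a small Python sketch (Python's `select.select` wraps the same call). A socket pair stands in for a device, and `demo` is a hypothetical helper. One caveat, as far as I know: on Linux a peer closing a stream socket usually shows up in the readable set, with the read returning no bytes, rather than in the third set.

```python
import select
import socket
import time

def demo(timeout=0.2):
    a, b = socket.socketpair()      # 'a' plays the fd we watch, 'b' the far end
    try:
        # Nothing to read yet: the call sleeps until the timeout expires.
        start = time.monotonic()
        r, w, x = select.select([a], [], [a], timeout)
        waited = time.monotonic() - start
        timed_out = (r, w, x) == ([], [], [])

        # A fresh socket has free buffer space, so it is writable at once.
        _, w, _ = select.select([], [a], [], timeout)
        can_write = (w == [a])

        # Once the far end sends, the fd turns up in the readable set.
        b.sendall(b"x")
        r, _, _ = select.select([a], [], [], timeout)
        got = a.recv(1) if r else None

        # Far end closes: readable again, and recv() yields b"" (end of stream).
        b.close()
        r, _, _ = select.select([a], [], [], timeout)
        eof = bool(r) and a.recv(1) == b""
        return timed_out, waited, can_write, got, eof
    finally:
        a.close()
        b.close()
```

No extra thread is created anywhere in this sketch; the calling thread simply sleeps inside select until the kernel has something to report.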
Title: Re: Threading is Hurting My Brain
Post by: Thorin on September 16, 2010, 10:33:03 AM
Quote from: Mr. Analog on September 15, 2010, 07:19:11 PM
You're threading hardware calls? Madness!

When using the SerialPort class in .NET, you specify a method to handle the DataReceived event.  From MSDN (http://msdn.microsoft.com/en-us/library/system.io.ports.serialport.datareceived.aspx):
Quote
The DataReceived event is raised on a secondary thread when data is received from the SerialPort object. Because this event is raised on a secondary thread, and not the main thread, attempting to modify some elements in the main thread, such as UI elements, could raise a threading exception. If it is necessary to modify elements in the main Form or Control, post change requests back using Invoke, which will do the work on the proper thread.

So I'm doing it specifically as Microsoft says it should be done - all work in the form is being marshalled back to the main UI thread.

Still had to read through that site to make sure there weren't any threading gotchas with Invoke(), though.
Title: Re: Threading is Hurting My Brain
Post by: Mr. Analog on September 16, 2010, 01:55:23 PM
Quote from: Thorin on September 16, 2010, 10:33:03 AM
Quote from: Mr. Analog on September 15, 2010, 07:19:11 PM
You're threading hardware calls? Madness!

When using the SerialPort class in .NET, you specify a method to handle the DataReceived event.  From MSDN (http://msdn.microsoft.com/en-us/library/system.io.ports.serialport.datareceived.aspx):
Quote
The DataReceived event is raised on a secondary thread when data is received from the SerialPort object. Because this event is raised on a secondary thread, and not the main thread, attempting to modify some elements in the main thread, such as UI elements, could raise a threading exception. If it is necessary to modify elements in the main Form or Control, post change requests back using Invoke, which will do the work on the proper thread.

So I'm doing it specifically as Microsoft says it should be done - all work in the form is being marshalled back to the main UI thread.

Still had to read through that site to make sure there weren't any threading gotchas with Invoke(), though.

Granted, my experience with hardware/GUI interaction is limited, but I think there are some parallels with the way a web GUI can interact with a web service call.

If you kick off a new thread every time the user hits the big red button that results in another child process that affects what the big red button does (database update, GUI update, etc) it will lead to trouble when the user starts jamming on it (or if there are communication problems).

Again, for what I do, I exclusively use/create async threaded processes on operations that are only going in one direction, as in, even if they fail we don't care (I call it "fire and forget"). A lot of that is creating requests that end up in a queue where the actual processing is handled.
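A minimal Python sketch of that "fire and forget" shape, assuming a single background worker draining a queue. The `FireAndForget` class and its counters are made up for illustration and are not from the app being described.

```python
import queue
import threading

class FireAndForget:
    """Callers enqueue a request and move on; nothing is reported back."""

    def __init__(self):
        self._requests = queue.Queue()
        self.completed = 0
        self.failed = 0
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, func, *args):
        """Queue a request and return immediately."""
        self._requests.put((func, args))

    def _run(self):
        # The only place requests are processed, strictly one at a time.
        while True:
            func, args = self._requests.get()
            try:
                func(*args)
                self.completed += 1
            except Exception:
                self.failed += 1        # counted here, never surfaced to the caller
            finally:
                self._requests.task_done()

    def drain(self):
        """Block until everything queued so far has been processed (for tests)."""
        self._requests.join()
```

Usage would look like `jobs = FireAndForget(); jobs.submit(print, "hello")`. The caller never learns whether a request failed, which is only acceptable for one-way operations like the ones described above.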

Err anyway, back on topic, I don't care what Microsoft says about safety there, if either one of those thread pools becomes filled or locked (for any reason) the whole transaction is borked and the main worker thread will lock, incoming requests will pile up and ka-blooey!

(err, that is to say, if no more child threads can be created, the current main worker thread will enter wait mode and pass processing to the next main worker thread, if the problem is that there is no more available resources [or whatever] the same thing will happen until you run out of main worker threads in your main thread pool, then ya can say g'night Gracie and hope there's an easy way to kill all the in proc workers with minimal cost to data integrity).

Either that or all this DayQuil has finally gotten to my brain...
Title: Re: Threading is Hurting My Brain
Post by: Thorin on September 16, 2010, 04:06:26 PM
See, that's why I said it's hurting my brain...  Even when you think you have things figured out, there's still that lurking uncertainty.  Threading is so cool yet so hard to do 100% right.  It's almost like real security, in that sense.

Your description of using asynchronous threads makes me think of the discussion with That Guy At Questionmark about CQRS (Command Query Responsibility Segregation (http://www.udidahan.com/2009/12/09/clarified-cqrs/)) where basically there are asynchronous updates to the database that do not need to return any data.  And that hurts my brain, too.
Title: Re: Threading is Hurting My Brain
Post by: Tom on September 16, 2010, 11:02:35 PM
Oy, so MS's api creates and fires off a new thread per event? or just one per device? Personally I'd prefer the latter, but I'd prefer a single thread for all devices, but if MS doesn't "allow" that, well that just sucks.
Title: Re: Threading is Hurting My Brain
Post by: Mr. Analog on September 17, 2010, 11:22:10 AM
Quote from: Thorin on September 16, 2010, 04:06:26 PM
See, that's why I said it's hurting my brain...  Even when you think you have things figured out, there's still that lurking uncertainty.  Threading is so cool yet so hard to do 100% right.  It's almost like real security, in that sense.

Your description of using asynchronous threads makes me think of the discussion with That Guy At Questionmark about CQRS (Command Query Responsibility Segregation (http://www.udidahan.com/2009/12/09/clarified-cqrs/)) where basically there are asynchronous updates to the database that do not need to return any data.  And that hurts my brain, too.

Yep, the main thing I see wrong with most multi-threaded implementations is that the light bulb only goes on "halfway": the half that says "I can run things in parallel" lights up, but the other half that says "those things have dependencies that make them synchronous with serious locking ramifications no matter how badly I want them to be otherwise" never does.

I could go on and on about the app I work on in very loose terms with great examples of this kind of thinking that got halfway there (but I won't bore/frighten you).

I've actually taken a look back at my own asynchronous multithreaded implementations (specifically AJAX stuff) and seen various flaws and performance bottlenecks that were caused by my not considering the full scope of what I was doing.

Part of it is knowing how the tech works, for example:

The .NET Framework has a memory limit of 2 GB (for CLR objects). Web Services are by design multi-threaded: they have a pool of 25 worker threads and will split off each incoming request into its own process. If you are pushing ~80 MB per request into memory, the service can handle 25 simultaneous calls before the .NET Framework will run out of available memory. With all the CLR object memory committed, no other .NET code can allocate memory until what's "in use" is deallocated by the garbage collector. But get this: if there are still incoming service requests, the thread pool will lock and start committing object data to L2 memory, and it will never be cleaned up by the garbage collector! Your only option is to recycle ASP.NET manually (YIKES!!).
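The arithmetic behind those numbers is easy to check. The sketch below (in Python) just takes the figures quoted above at face value: a 2 GB limit, about 80 MB per request, and 25 worker threads. It does not verify them against .NET itself, and the names are made up for illustration.

```python
def max_concurrent_requests(limit_mb, per_request_mb):
    """How many requests of a given size fit under the memory limit."""
    return limit_mb // per_request_mb

LIMIT_MB = 2 * 1024        # 2 GB limit, as quoted above
PER_REQUEST_MB = 80        # approximate per-request footprint, as quoted above
WORKER_THREADS = 25        # worker pool size, as quoted above

fits = max_concurrent_requests(LIMIT_MB, PER_REQUEST_MB)
headroom_mb = LIMIT_MB - WORKER_THREADS * PER_REQUEST_MB
```

With these inputs, 25 requests use 2000 MB and leave 48 MB spare, so a 26th request of the same size would not fit.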

Now, obviously that's an unlikely example for a single web service, but it's easy to imagine this happening with many web services all consuming/using memory as small as a few kilobytes to several megs on each request and it doesn't take a lot of guessing to know what happens to async processes and impatient users when there is no available memory!

So err, I guess I *did* throw in a generalized example from work, this actually used to happen all the time last year (and still does when things get really busy). For us the primary culprit was a known memory leak in the XML libraries used in .NET 1.1 that started to poison the well at about 6 AM every day causing what I described above to happen artificially BUT it sure made me think about what was really going on in a holistic way.

CRAY-ZEE!

I swear I learn so much from these crippled/broken apps!
Title: Re: new M Night film not a "negative tomatometer rating"
Post by: Tom on September 17, 2010, 11:48:54 AM
There's a reason my silly little game's server isn't threaded (yet). Right now, if I can get away with it, it'll stick to being a single process per node, but the nodes can all talk to each other. And more than one can be running on the same host, and listen on the same port, so that gets me multi-core support ;D

If I do end up having to add a thread or two to the server, it'll be in the actual game logic handling. The AI or move validation code might start causing too much latency on larger boards, and having the AI or validation code in a separate thread would help alleviate that. And I could potentially use "rwlocks", which are fancy locking primitives that let multiple threads read so long as no one is writing, so only when something goes to modify game state should there be any locking. And since it's a turn-based board game, in any modern computer's eyes, it won't happen very often per game.
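For the curious, a readers-writer lock can be sketched in a few lines of Python on top of `threading.Condition` (Python has no built-in one; C has `pthread_rwlock_t`). This is a minimal, reader-preferring version, so a steady stream of readers could starve a writer. `ReadWriteLock` and `demo` are hypothetical names for illustration, not the game server's code.

```python
import threading

class ReadWriteLock:
    """Many concurrent readers, or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:              # readers only wait for a writer
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()       # a waiting writer may proceed

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()             # wait for exclusive access
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()

def demo():
    lock = ReadWriteLock()
    board = {"moves": 0}
    writer_done = threading.Event()

    def writer():
        lock.acquire_write()
        try:
            board["moves"] += 1               # modify game state exclusively
        finally:
            lock.release_write()
        writer_done.set()

    lock.acquire_read()
    lock.acquire_read()                       # a second reader gets in at once
    t = threading.Thread(target=writer)
    t.start()
    blocked = not writer_done.wait(0.1)       # writer waits while readers hold on
    lock.release_read()
    lock.release_read()
    t.join()
    return blocked, board["moves"]
```

In a turn-based game the write path runs rarely, so readers such as AI or validation code would almost never wait.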

[Mod: Moved post as it was in the wrong topic]