Good morning/evening gentlemen,
I have a CFTBL, a PIN2DMD (replacing the original Plasma), and a v3.0 board (assembled by me, but components provided by Steve).
As I stated above, I experienced glitches (music sometimes not playing, wrong callout being used, etc.), which disappeared after I resoldered the board. I assumed that I had a bad trace connection somewhere, story closed.
However the issue reappeared. Relatively often.
Now, an earlier comment from Steve stated that clock speed was likely not an issue, as the bus interface speed is relatively low compared to the acquisition speed of the RPI.
In the case of bus acquisition, the data provided by a few users on this thread indicates a reasonably wide variety of HW configurations (different boards, different RPI, different pinball machines), and all exhibit failures in some form. True, we don't know whether this extends to other users, but looking at my own experience, the glitches were random enough that I would probably not have raised the issue had I not been reading this thread.
All these failures point to a data acquisition failure, which could be caused by:
- Signal integrity: the only way to know for sure is to hook up an oscilloscope at the input of the buffers (I did not check what was used), which is not really practical for most people. Given the design it is also very unlikely (there is no Y connection, it's a straight end-to-end run with impedance matching from what I could see), unless there's noise on the power supply lines. That would again be unlikely, since power for TiltAudio is taken directly from 20VAC through a rectifier bridge, and it has its own power rails.
- The RPI missing its time slice for acquisition, or the data input being garbled
Disclaimer: what follows is pure speculation on my side, however it is based on professional experience with SoC type devices used for similar types of application (i.e. time sensitive).
I have a strong suspicion this is related to the notion of "Real Time OS", as in "Hard real time". This is not about speed (well, sort of), it's about scheduling. With an RT-OS, scheduling can be set in such a way that no matter what, a specific thread/process will get its allocated timeslot, whatever the timeslot is (within reason of course). Standard Linux is not an RT OS, so even if you set the priority of a specific thread to high, there is a chance that another thread will stall the scheduling, thus causing the critical thread to miss its timeslot.
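To make "set priority of a specific thread to high" concrete, here is a minimal Python sketch of asking the Linux scheduler for the SCHED_FIFO real-time policy. I don't know how TiltAudio is actually written, so the function name and the priority value of 50 are purely my own assumptions for illustration. On a stock kernel this raises the priority but still gives no hard guarantee, which is exactly the point made above.

```python
import os

def request_realtime_priority(priority=50):
    """Try to switch the calling process to SCHED_FIFO.

    Returns True on success, False if the OS refuses
    (typically because the process lacks root / CAP_SYS_NICE).
    The name and default priority are hypothetical.
    """
    if not hasattr(os, "sched_setscheduler"):
        return False  # not a Linux system

    # Clamp the requested priority to the range the kernel allows.
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    priority = max(lowest, min(highest, priority))

    try:
        # pid 0 means "the calling process".
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        return False
    return True

if __name__ == "__main__":
    if request_realtime_priority():
        print("Running under SCHED_FIFO")
    else:
        print("Could not switch to SCHED_FIFO (try running as root)")
```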
Also, if you have multiple inputs and read them in independent threads, there is a chance that these inputs are sampled at slightly different times before being aggregated. For example, in the case of GPIO reading, you read GPIO1, then GPIO2, etc., then combine them into a single byte, but the timeslot of GPIO2 might be offset relative to GPIO1, resulting in a corrupted byte.
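The effect can be shown with a small simulation, no hardware needed. This is only an illustration of the principle, not of how TiltAudio reads its bus: the two function names and the bus values 0x0F and 0xF0 are made up for the example. The bus changes value while the bits are being collected one by one, and the assembled byte ends up being neither the old nor the new value.

```python
def read_bitwise(samples):
    """Assemble a byte from 8 single-bit reads taken at 8 different instants.

    samples[i] is the full bus value at the moment bit i is read.
    """
    value = 0
    for bit, bus in enumerate(samples):
        value |= ((bus >> bit) & 1) << bit
    return value

def read_snapshot(samples):
    """Assemble a byte from one read of all 8 lines at the same instant."""
    return samples[0] & 0xFF

OLD, NEW = 0x0F, 0xF0

# The bus switches from OLD to NEW after the first four bit reads,
# e.g. because the reader was delayed halfway through.
samples = [OLD] * 4 + [NEW] * 4

torn = read_bitwise(samples)
clean = read_snapshot(samples)

print(f"bit-by-bit read: 0x{torn:02X}")   # 0xFF, neither OLD nor NEW
print(f"snapshot read:   0x{clean:02X}")  # 0x0F, a value that really was on the bus
```

Reading all lines in one operation avoids the torn value, but it still requires the read to happen inside the window where the bus is stable, which brings us back to scheduling.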
More information here:
https://www.guru99.com/real-time-operating-system.html
A real RT-OS is pretty hard to deal with for complex tasks, so the Linux kernel group has developed a set of "semi RT" patches, known as "preempt RT patches", which essentially are scheduler patches to provide some real time capabilities, within limits.
Details here:
https://rt.wiki.kernel.org/index.php/CONFIG_PREEMPT_RT_Patch
and a link to a RPI Preempt RT patch documentation (rather old though):
https://lemariva.com/blog/2018/02/raspberry-pi-rt-preempt-tutorial-for-kernel-4-14-y
I don't know if the base BSP (OS) used by Steve for TiltAudio has those patches installed, but if it does not, adding these patches to the kernel could potentially increase the reliability of the bus acquisition. It would however require some engineering investment, since you would need to recompile the kernel with the patches (should be straightforward), and probably change the thread management of the application to separate the threads that are critical from those that are not (probably less straightforward, although I can't really comment: while I know the principle of operation, I'm not a SW developer).
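For anyone who wants to check their own image, here is a rough Python sketch of how one could test whether the running kernel was built with the preempt RT patches. It relies on two heuristics that I believe hold for RT-patched kernels: the presence of /sys/kernel/realtime, and a "PREEMPT_RT" or "PREEMPT RT" tag in the kernel version string. The function name is my own, and a negative result should be treated as a hint rather than proof.

```python
import os
import platform

def kernel_looks_preempt_rt():
    """Heuristic check for a PREEMPT_RT kernel. Returns True or False."""
    # RT-patched kernels normally expose this file containing "1".
    rt_flag = "/sys/kernel/realtime"
    if os.path.exists(rt_flag):
        try:
            with open(rt_flag) as handle:
                if handle.read().strip() == "1":
                    return True
        except OSError:
            pass  # fall through to the version-string check

    # Same information as `uname -v`.
    version = platform.version()
    return "PREEMPT_RT" in version or "PREEMPT RT" in version

if __name__ == "__main__":
    print("Kernel version:", platform.version())
    print("Looks like PREEMPT_RT:", kernel_looks_preempt_rt())
```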
But IMO it's worth a shot.
I've had similar issues on ARM57/ARM53 type SoCs, which failed in a time critical environment precisely for that reason. In my case, as the constraints were severe (aero), they had to go the "hard" way by using a specific Linux BSP completely designed for real time (for those interested, called Redhawk from Concurrent Technologies).
Note: increasing clock speed and CPU efficiency reduces the likelihood of this type of occurrence, simply because everything runs faster, so the chances of one thread blocking another from executing are smaller. From the architecture standpoint, however, nothing is guaranteed. So replacing with a RPI4 might help in this case (coincidentally, this is also recommended by Steve for Bluetooth operation anyway).
Another alternative would be to look at CPU affinity: essentially locking all non critical processes to "general" cores, and locking the acquisition threads (or any time critical process) to the last available core, dedicated solely to that purpose. This will not necessarily help if the thread stalling is caused by internal bus contention within the SoC (in which case, even with a CPU core completely available, access could still be blocked if the GPIO block is in use by another thread on another core), but it's also worth a shot.
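As a sketch of the affinity idea, the Python snippet below pins the calling process to the highest-numbered core it is allowed to run on, using the standard os.sched_setaffinity call (Linux only). The function name is hypothetical, and which TiltAudio process or thread should actually be pinned is something only Steve can say.

```python
import os

def pin_to_last_core(pid=0):
    """Restrict a process to the highest-numbered core it may use.

    pid 0 means "the calling process". Returns the chosen core number.
    """
    allowed = os.sched_getaffinity(pid)   # set of usable core numbers
    last_core = max(allowed)
    os.sched_setaffinity(pid, {last_core})
    return last_core

if __name__ == "__main__":
    core = pin_to_last_core()
    print(f"Now restricted to core {core}: {os.sched_getaffinity(0)}")
```

Note that this only keeps the critical process on one core. Keeping everything else off that core is a separate step, for example by setting the affinity of the other processes the same way, or with the isolcpus kernel boot parameter.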
So a path of investigations would be:
- Use a RPI4 just to decrease contention ratio: easily achievable by every user, at least for testing
- Implement CPU affinity (some details here: https://www.xmodulo.com/run-program-process-specific-cpu-cores-linux.html ). Probably need a little bit of guidance from Steve to identify the processes that we would want to pin to a specific CPU (and use all others for the other processes).
- Implement Preempt RT patches if they are not already enabled; if they are, check how the SW defines the priority of the data acquisition thread. Significant investment on Steve's side, so to be used only as a last resort (and assuming this issue is more widespread than originally thought)
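To compare the options above (RPI3 vs RPI4, with and without affinity or RT patches), it helps to have a number. The Python sketch below is a crude way to see how late the scheduler wakes a process up: it sleeps for a fixed period many times and records by how much each wake-up overshoots. The function name, the 1 ms period and the iteration count are arbitrary choices of mine, and a dedicated tool such as cyclictest would give far more trustworthy figures.

```python
import time

def measure_wakeup_overshoot(period_s=0.001, iterations=200):
    """Return a list of wake-up overshoots in nanoseconds.

    Each entry is (actual sleep duration) - (requested sleep duration).
    """
    requested_ns = int(period_s * 1e9)
    overshoots = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        time.sleep(period_s)
        elapsed = time.perf_counter_ns() - start
        overshoots.append(elapsed - requested_ns)
    return overshoots

if __name__ == "__main__":
    results = measure_wakeup_overshoot()
    average_us = sum(results) / len(results) / 1000
    worst_us = max(results) / 1000
    print(f"average overshoot: {average_us:.1f} us")
    print(f"worst overshoot:   {worst_us:.1f} us")
```

The worst case is the interesting figure here, not the average: a single late wake-up is enough to miss a bus event. Running it while the machine is playing music and driving the DMD should be more representative than running it on an idle system.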
Apologies if what I'm writing above has already been considered and implemented, in which case this idea is completely moot.
On my side I wanted to upgrade to a RPI4 anyway, at least for Bluetooth range, so I'll test it when I receive it.
Regards