Common iGaging EZ-View DRO Problems

Sunday, May 9, 2021

One of the most common problems that TouchDRO users report is jumping of the position readout. There are a few different flavors of this problem, but the common factor is that the position jumps up or down by some amount and then returns to the normal reading. Most common are frequent small jumps by one encoder count; in some rare cases, the scale might rapidly jump up and down by a few encoder counts as well. Much less common are cases where the position momentarily jumps by a random large amount and then returns to the correct reading. These particular problems almost exclusively affect the iGaging EZ-View and DigiMag "Remote DRO" scales [and their rebranded counterparts]. Since the iGaging display masks this behavior, many TouchDRO users come to believe that there is something wrong with their setup.

Note that capacitive scales with remote displays are more sensitive to electrical noise than glass and magnetic DRO scales. If you are trying to resolve flaky readout issues that are different from the one described in this post, take a look at the TouchDRO troubleshooting guide.

Principle of Operation

First, let's do a quick recap of the principle of operation of capacitive DRO scales, and iGaging EZ-View in particular. This will help us better understand where potential issues can arise and identify the common causes.

Capacitive scales, as the name suggests, measure the change in capacitance and convert it to a position reading. They consist of an encoder strip attached to the scale's frame that has a series of equally spaced pads; in the movable head, there is a small PCB that has a set of pads with slightly different spacing. These pads form a set of very low value capacitors. The microcontroller can very precisely measure the capacitance by measuring the time it takes the capacitors to fully charge or discharge. As the scale moves, the relative overlap [and the capacitance] between the pads changes in a particular way that can be plotted as two sinusoidal lines that are offset by 90 degrees [so the graphs are in fact sine and cosine]. The scale then uses one of the sine/cosine interpolation algorithms to very precisely derive its travel distance. Note that the values being measured here are very small, and not all interpolation algorithms have equal noise rejection performance. As a result, stray capacitive coupling and fluctuation in supply voltage can throw off the reading.
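To make the interpolation step more concrete, here is a minimal Python sketch of one common sine/cosine technique: arctangent interpolation with phase unwrapping. It is not the algorithm used inside the iGaging scales, which isn't documented. The function names, the `pitch` parameter and its 5 mm value are assumptions made up for illustration.

```python
import math

def interpolate(sin_val, cos_val, prev_phase, prev_position, pitch):
    """Update the position estimate from one sine/cosine sample pair.

    pitch is the travel covered by one full sine period (hypothetical value).
    """
    phase = math.atan2(sin_val, cos_val)      # angle within the current period
    delta = phase - prev_phase
    # Unwrap: a jump of more than half a period means we crossed the +/-pi seam.
    if delta > math.pi:
        delta -= 2 * math.pi
    elif delta < -math.pi:
        delta += 2 * math.pi
    return phase, prev_position + delta / (2 * math.pi) * pitch

def track(true_positions, pitch, sine_offset=0.0):
    """Feed ideal sine/cosine signals through interpolate().

    sine_offset adds a constant error to the sine channel, standing in for
    stray coupling or a supply voltage shift.
    """
    phase, position = 0.0, 0.0
    for x in true_positions:
        angle = 2 * math.pi * x / pitch
        phase, position = interpolate(math.sin(angle) + sine_offset,
                                      math.cos(angle), phase, position, pitch)
    return position

PITCH_MM = 5.0                                  # assumed, for illustration only
path = [i * 0.05 for i in range(251)]           # move from 0 to 12.5 mm
print(round(track(path, PITCH_MM), 6))          # 12.5
print(track(path, PITCH_MM, sine_offset=0.01) - 12.5)  # error caused by the offset
```

Even a 1% disturbance on one channel shifts the computed position by several microns in this toy model, which gives a feel for why such small signals are easy to upset.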

Once the position is derived, the scale has to display it to the end user, but since these particular scales use a remote display, the position has to be sent to it over a wire. For historical reasons, these scales keep track of the travel distance and send it to the display or TouchDRO adapter using a synchronous serial protocol that sends the position as a set of high and low pulses using the two wires (Clock and Data). When the clock signal goes from 0V to 3V (positive edge), the scale sets the data line to the next bit, and when the clock goes high-to-low (negative edge) the processor reads the value. The screen capture below shows a yellow clock line and blue data line.

Oscilloscope Capture of iGaging 21-bit Data Stream

One interesting quirk of iGaging scales is the fact that the display provides the clock, whereas in all other scales the reading head provides the clock and the data signals.
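The clocking scheme described above can be sketched in a few lines of Python. This is only an illustration of sampling a data line on the negative clock edge, not TouchDRO's firmware. The helper names are made up, the waveforms are idealized lists of 0/1 samples, and the bit order (first bit received is the most significant) is an assumption chosen so that a stream reads the same way it is written out in this post.

```python
def make_waveform(bit_string):
    """Build idealized clock/data sample lists for one frame (hypothetical helper)."""
    clock, data = [0], [0]
    for ch in bit_string:
        bit = int(ch)
        clock += [1, 0]        # positive edge, then negative edge
        data += [bit, bit]     # scale sets the bit on the positive edge and holds it
    return clock, data

def decode_frame(clock, data, expected_bits=21):
    """Sample the data line on every negative clock edge and build the value."""
    bits = [level for prev, curr, level in zip(clock, clock[1:], data[1:])
            if prev == 1 and curr == 0]
    if len(bits) != expected_bits:
        raise ValueError(f"expected {expected_bits} bits, got {len(bits)}")
    value = 0
    for bit in bits:           # assumed order: first bit is the most significant
        value = (value << 1) | bit
    return value

clock, data = make_waveform("000111001001110100000")
print(decode_frame(clock, data))   # 234400
```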

Root Causes of The Jumping

As I mentioned earlier, there are a few different flavors of this problem. Since jumps by 0.0004" are by far the most common, let's look at them first.

Readout Jumps by 0.0004"

To understand the nature of this problem, imagine that you were given a ruler with 1" graduations and asked to measure some object that is 3.5" in size and round the size to the nearest inch. Since there is no 0.5" graduation, you would have to use your judgement to determine if the measurement is closer to 3" or 4". As you look at the object, the reading looks closer to 3", but then the lighting changes and now it looks closer to 4". This is exactly what the encoder has to do, except its graduations are about 9.9 microns (or about 0.00039") in size. In other words, as the scale is moved, its internal processor has to convert an infinite number of analog values to a finite set of discrete steps. When the scale is moved to a position that is half-way between two "ticks", the encoder has to round the reading to the nearest one. Since the capacitance values are very small, even a minuscule amount of noise in the power supply or stray capacitive coupling with your body or other objects in the environment can nudge the value enough to jump to the other side of the [virtual] graduation.
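The ruler analogy can be reproduced with a toy model in Python. This is a sketch under simple assumptions, not the scale's real firmware: positions are expressed in units of encoder steps, and the noise values are arbitrary small numbers picked for the example.

```python
import math

def to_counts(analog_value):
    """Round a continuous position (in encoder steps) to the nearest whole step."""
    return math.floor(analog_value + 0.5)

def readings(true_value, noise_samples):
    """Quantize the same physical position under several noise conditions."""
    return [to_counts(true_value + n) for n in noise_samples]

noise = [-0.02, 0.02, -0.01, 0.03]    # made-up noise, a few percent of one step

print(readings(100.5, noise))   # [100, 101, 100, 101]  half-way between two ticks
print(readings(100.1, noise))   # [100, 100, 100, 100]  comfortably inside one tick
```

The same noise that is invisible at most positions flips the result back and forth when the scale happens to rest near the boundary between two steps.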

The least significant bit is flapping between 1 and 0

This affects virtually all iGaging EZ-View scales to some extent, but shop noise can greatly exaggerate the issue. For example, starting and stopping the motor can alter the electromagnetic field around the machine, which will affect the capacitance reading. Even as you move around, the capacitance between your body and the machine can make a big enough impact to nudge the reading up or down. The end result can be seen in the GIF image above. It consists of about 30 consecutive oscilloscope screen captures taken while the scale was stationary. iGaging EZ-View DRO scales have a resolution of 2560 positions per inch, or about 9.9 microns per step. As a result, as the last bit flips between "high" and "low", the TouchDRO display will go up and down by 0.0004" or 0.01mm.
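As a quick sanity check of the step size, the conversion from encoder counts to distance can be written out in Python. The only input taken from this post is the 2560 counts per inch figure; the function names are made up for the example.

```python
COUNTS_PER_INCH = 2560
MM_PER_INCH = 25.4

def counts_to_inches(counts):
    return counts / COUNTS_PER_INCH

def counts_to_mm(counts):
    return counts * MM_PER_INCH / COUNTS_PER_INCH

print(counts_to_inches(1))            # about 0.00039 inch per step
print(counts_to_mm(1) * 1000)         # about 9.92 microns per step
print(round(counts_to_inches(1), 4))  # 0.0004
print(round(counts_to_mm(1), 2))      # 0.01
```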

Momentary Large Jumps

From time to time the readout can momentarily jump up or down by a large distance and then immediately return to the correct reading. In most cases, this happens because the scale completely misses a clock cycle and doesn't set the next bit to the correct value. As a result, the display or the TouchDRO adapter receives the same bit twice and the rest of the stream is offset by one position. For instance, consider a stream that contains the following data: 000111001001110100000, which is equivalent to 234400 in decimal. If the scale misses the tenth clock cycle, the display will read the ninth bit twice and miss the very last bit. The stream will be 000111001100111010000, or 235984 in decimal. That is a difference of 1584 counts, or 0.6188". Almost all iGaging EZ-View scales do this from time to time. Usually, this happens very rarely and most users never notice it, but some older scales (circa 2012-2015) were more susceptible to it. It's hard to tell what causes this behavior, but a likely reason is that the microcontroller is doing some critical work and misses [or disables] the interrupt.
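The worked example above is easy to verify with a short Python simulation. The `miss_clock` helper is hypothetical and only models the effect described here (one bit read twice, the last bit lost); it treats the stream as written, with the first bit as the most significant.

```python
COUNTS_PER_INCH = 2560

def miss_clock(stream, missed_cycle):
    """Return the stream the reader sees if the scale ignores one clock pulse.

    missed_cycle is 1-based. The data line still holds the previous bit, so the
    reader samples it twice, and the final bit never gets clocked out.
    """
    if not 2 <= missed_cycle <= len(stream):
        raise ValueError("missed_cycle must be between 2 and the stream length")
    bits = list(stream)
    seen = bits[:missed_cycle - 1] + [bits[missed_cycle - 2]] + bits[missed_cycle - 1:]
    return "".join(seen[:len(bits)])

good = "000111001001110100000"
bad = miss_clock(good, 10)

print(bad)                                   # 000111001100111010000
print(int(good, 2), int(bad, 2))             # 234400 235984
error = int(bad, 2) - int(good, 2)
print(error, error / COUNTS_PER_INCH)        # 1584 0.61875
```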

In some cases, very similar behavior can be caused by spurious noise spikes and glitches in the clock line. If a spike is high enough, it can be mistaken for a clock pulse. Due to the way TouchDRO reads these scales, this is usually not a problem. The adapter generates clock signals independently of the reading logic. When a glitch is high enough, it's very likely to be received by the adapter and the scale, and will thus be handled correctly. In some rare cases, the amplitude of a spike can be such that either the adapter or the scale will not detect it. In this case the behavior will be very similar to the one described in the previous paragraph. This can also happen in the data line, but the spike would need to happen precisely at the clock negative edge, and that is statistically much less probable.

The scenarios above can manifest themselves as a climbing position with some early DigiMag scales. iGaging DigiMag and EZ-View scales expect 21 clock cycles. When a scale misses a clock pulse or interprets a spurious glitch as a legitimate signal, it will get either 20 or 22 pulses instead. Early scale versions tended to get permanently thrown off by this. As a result, there would be a permanent error between the scale and the TouchDRO adapter. Newer scale revisions seem to reset their pulse counter after a few milliseconds, so they recover from this after a few data streams.

Readout Jumps by Several Thousandths

The behavior is very similar to the first issue, but the distance is larger than a single encoder unit. I've encountered only a handful of scales that had this problem, so it's hard to tell what the root cause is. In every single case, a few bits at the end of the stream (least significant bits) were flapping between all 1's and all 0's. The animated GIF below is representative of this behavior. Notably, the number of flapping bits seems to be constant for a given scale. In other words, some scales had three flapping bits, regardless of the position, while others had two.

This is an example of a rare case when more than one bit is flapping

It doesn't seem to be caused by noise or incorrect capacitance measurements; noise or stray capacitance would likely result in more random flickering. Furthermore, the behavior didn't seem to change even when the scale was placed into a Faraday cage (and was thus shielded from environmental noise).

Why Doesn’t This Happen with iGaging Display?

A common complaint is that the issues described above never happened with the native display, but keep happening with TouchDRO. The reason is that the native iGaging display is using various software tricks to hide those issues from the end user. While I don't have access to their source code, I can presume [based on my observations] that the display does a few different things besides simple averaging of multiple readings. The reason I suspect this is that I've observed cases where the oscilloscope showed some serious jumping, while the display stayed completely still.

For instance, when the above capture was taken, the reading on the iGaging EZ-View display was completely stationary, without any hints of flicker. All the while, the value in the data stream was jumping between 1743 and 1736 (0.0027"/.07mm). In other words, the display completely masked this issue.
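Comparing the two values bit by bit shows which part of the stream is flapping. The snippet below is a small Python illustration; the helper name is made up, and bit 0 is taken to mean the least significant bit.

```python
COUNTS_PER_INCH = 2560

def differing_bits(a, b):
    """Return the positions (0 = least significant) of the bits that differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]

high, low = 1743, 1736
print(format(high, "011b"))          # 11011001111
print(format(low, "011b"))           # 11011001000
print(differing_bits(high, low))     # [0, 1, 2]

jump_in = (high - low) / COUNTS_PER_INCH
print(round(jump_in, 4), round(jump_in * 25.4, 2))   # 0.0027 0.07
```

Only the three lowest bits change, going from all 1's to all 0's, which works out to a jump of 7 counts.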

Why Can't TouchDRO Do This?

Over time, I've added many small tweaks to work around various scale idiosyncrasies. In most cases, those are limited to optimizing the reading speed, oversampling the signal in order to filter out spurious noise and other similar things. There are further improvements I can make thanks to the extra processing power that the new ESP32 architecture offers. That said, there is only so much that can be done in the software when a scale is sending bad data.

For instance, there are relatively simple software tricks that can make the display appear stable, but they inevitably reduce the resolution and responsiveness of the system. This sort of back and forth flickering can be relatively easily reduced [but not completely eliminated] by averaging a large number of readings. The drawback to that would be a serious refresh lag that would make the DRO very hard to use. Another approach would be to average only a few values, and when this sort of jumping is detected, latch in the last average until the scale moves. This would completely fix the jumping at the cost of resolution and repeatability.
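To illustrate the trade-offs, here is a rough Python sketch of both ideas. Neither class is TouchDRO code or a reconstruction of what the iGaging display does. The class names, window size and threshold are assumptions, and the second class is a plain deadband latch standing in for the "latch until the scale moves" idea.

```python
from collections import deque

class MovingAverage:
    """Average the last `window` readings (in encoder counts)."""
    def __init__(self, window):
        self.samples = deque(maxlen=window)

    def update(self, reading):
        self.samples.append(reading)
        return sum(self.samples) / len(self.samples)

class DeadbandLatch:
    """Hold the last value until a reading moves by more than `threshold` counts."""
    def __init__(self, threshold):
        self.threshold = threshold
        self.latched = None

    def update(self, reading):
        if self.latched is None or abs(reading - self.latched) > self.threshold:
            self.latched = reading
        return self.latched

def run(filter_, readings):
    return [filter_.update(r) for r in readings]

# Averaging: a real 100-count move shows up slowly (refresh lag).
print(run(MovingAverage(8), [0] * 8 + [100] * 3)[-3:])    # [12.5, 25.0, 37.5]

# Latching: flicker between 100 and 101 disappears...
print(run(DeadbandLatch(1), [100, 101, 100, 101, 103]))   # [100, 100, 100, 100, 103]
# ...but so does a genuine one-count move (lost resolution).
print(run(DeadbandLatch(1), [100, 101, 101, 101]))        # [100, 100, 100, 100]
```

In this toy model the averaged display is still far from the true position three samples after the move, and the latched display cannot tell last-bit flicker from a real 0.0004" adjustment.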

A DRO is a precision measurement device. Adding any sort of logic to the firmware that would make things up in order for the display to look stable would make it very difficult to reason about the accuracy of the measurements. Consequently, I've made a conscious decision not to implement any filtering or normalizations that would introduce any lag or accumulated error to the system. In my opinion, flickering digits or momentary readout jumps are a lot less harmful than accumulation of error that can lead you to scrapping a workpiece.

Conclusion

I hope this has shed some light on the common readout stability issues that can affect iGaging EZ-View and DigiMag scales. Random position jumps and flickering of the last digit affect almost all of these scales to some extent. Although iGaging displays can mask these issues, I've made a conscious decision to prioritise responsiveness and accuracy over display stability. That said, the ESP32 has enough available processing power to do some digital post processing that will reduce the visual impact of those issues. The results won't be as clean as the "brute force" approach, but I think it's better to know that you have a problem than to ruin your work due to accumulated error.
