• Members 545 posts
    May 24, 2023, 1:34 p.m.

    Other than the temporal quantization, I don't think anything should be missing from the "Bernoulli trial" approach to synthesize photon noise. Obviously, the simple way I did it can not have two events at "the same time" or during the same discrete time interval (one pixel), but that could be facilitated by doing an extra loop of trials and adding the results within each loop, which could also be achieved by binning and a change in trial threshold. The scarcer the events relative to display pixel size, the less that this quantized version of arrival deviates from analog timing. Anything binned to allow for multiple events per sample is going to blunt the signal of how much the spacing can actually vary.

    If a software function created floating point intervals based on a mean, you still have to choose how to display it (discrete or weighted-analog), and so even if we decide we want to have analog-weighted vertical lines that use variable grey in two adjacent pixel columns, to be virtually analog, the same could be done by just oversampling with the discrete approach by 10-fold, in a way which represents 0.1 pixel precision. The quantized way uses a lot less "trial" iterations, though (random() function calls are very slow), and shows the spacing randomness very well.

    Also, the quantized way has very little error when constructing a synthetic image via intensity-weighted drizzle, because only events that occur near the first or last time sample can be "missed" or "extra" due to quantization, and these are random quantization errors, but our synthetic image is already at the stochastic entropy endpoint before these errors, so nothing should be different than if it were just a different set of samples with the same characteristics.

  • Members 128 posts
    May 24, 2023, 4:15 p.m.

    Sure. Your approach is completely fine with me. I never intended to imply otherwise.

  • Members 128 posts
    May 24, 2023, 9:18 p.m.
  • Members 78 posts
    May 25, 2023, 6:56 a.m.

    Thank you for that JohnV, you seem to be really on top of this stuff so here are a couple of questions from a curious amateur without any formal training in the subject. Apologies in advance if this is completely off to left field.

    I agree that John Sheehy's solution, though technically a Binomial process rather than straight Poisson, is a good approximation for Poisson especially with a good number of events at the low rates we are talking about.

    Speaking about rates I'd prefer something with a low mean, say around 5, so that a histogram will clearly show the Poisson distribution. In his previous example with 500 'tosses' I'd prefer a probability of 1% for instance, the process repeated many times to make it statistically significant.

    Your solution is less intuitive to me. For instance this line:

    last_arrival_time = last_arrival_time + wait;
    

    Aren't arrivals in a Poisson process supposed to be statistically independent?

    Which brings me to another question, isn't time really optional in this discussion? I mean, if the photons are really statistically independent they could have been generated at any time, the process does not need for them to come one after another. In fact they could all have been generated completely randomly, thus out of sequence, sort of like I did with the two-line script earlier. If JohnS's lines represented photons striking silicon during Exposure, we would not expect them to arrive left to right, right?

    Jack

  • Members 128 posts
    May 25, 2023, 4:37 p.m.

    Look - I really don't consider myself an expert here.

    Perhaps I should say a little about my background. I have two undergraduate degrees and an MSc that all have "Mathematics" in their titles. But if you looked at the courses I'd taken for the undergraduate degrees, you'd see a couple of things: I significantly over-qualified for them - by taking more, and more advanced courses than necessary - and that if you listed the courses in a different way, they might look like physics or engineering degrees.

    My (technical) work has been mostly a mixture of low-level software, and hardware design - from designing boards to SoCs, usually involving video input and/or output - with several years working on compilers and hardware design tools.

    This probability and queuing stuff I've used sometimes for simple "Fermi problem" models of (sub-)system performance. It's not stuff I use every day.

    Don't ask me about differential geometry. I did fine on paper but TBH it was a bit "in one ear and out the other".

    Like I said, in these matters, I regard myself as an amateur too. I honestly thought that most of you PS&T folk understood this stuff better than me. I've met some very good mathematically-minded engineers from (and in) California.


    Yup.

    Pass.


    Yes, and they are. I think the difficulty may be related to the "memorylessness" or Markov property which is unique to the exponential (continuous time) and geometric (discrete time) distributions.

    If we look at John Sheehy's code, arrivals are also independent.

    But we can reformulate John Sheehy's code to look like mine, in terms of the (discrete) waiting times for arrivals.

    In John Sheehy's discrete model, suppose we have an arrival at t=n.

    What is the distribution of the size of the gap before the next arrival?

    We could have a zero-length gap: we could have an arrival at t=n+1. This has probability p. (JS has p=0.03).

    We could have a length 1 gap: the next arrival is at at t=n+2. This has probability (1-p)p.

    We could have a length m gap: the next arrival is at at t=n+(m+1). This has probability (1-p)^m p.

    The (discrete) waiting time AKA gap size for an arrival in JS's code has a (zero-based) geometric distribution, with parameter p:

    double p = 0.03;
    int arrival_time[n_arrivals];
    int current_time = 0;
    for( int i = 0; i < n_arrivals; ++i )
    {
       // How many failed trials before next success?
       int wait = floor( log(random(1))  /  log(1.0 - p) );     // (Zero based) Geometrically distributed, parameter p
    
       current_time = current_time + wait;
       arrival_time[i] = current_time;
    
       current_time = current_time + 1;        // Start waiting again at next (discrete) time
    }
    // At this point, the sum of the gap lengths is (current_time - n_arrivals), which has a Negative Binomial Distribution: NB(n_arrivals, p)
    

    If the loop is allowed to run until (current_time >= n), the number of arrivals at (discrete) times 0 <= t < n is binomially distributed: B(n,p).

    The geometric distribution is to the binomial as the exponential distribution is to the Poisson.


    The

    int wait = floor( log(random(1))  /  log(1.0 - p) );     // (Zero based) Geometrically distributed, parameter p
    

    is proved (e.g.) here: home.csulb.edu/~tebert/teaching/lectures/552/variate/variate.pdf#page=4 (The author is using a 1-based geometric distribution, not the 0-based geometric distribution that I'm using).


    Is time optional? That might depend on whether you want to model the shutter opening and closing, or the possibilities of subject movement or change of lighting during exposure. In that case we would have a non-homogeneous Poisson arrival process. Which is not hugely more complicated.

    Outside of photography, JS's sequence of bunched-up lines does a good job of illustrating what a (memoryless) (homogeneous) Poisson process looks like. The probability of arrival per unit time is constant, but the arrivals look bunched up to most people.

    Which causes headlines like "Sudden rise in knife crime!" or "Why are all these planes crashing now?", when this "bunchiness" is often entirely consistent with a constant underlying probability of events per unit time. The big gaps don't make headlines.

    Then sometimes the bunchiness does indicate a problem - for example if a major airplane manufacturer releases a new airliner with defective and undocumented control responses.


    Edit:
    - Fix "gap length": We could have a length m gap: the next arrival is at at t=n+(m+1). This has probability (1-p)^m p.

  • Members 78 posts
    May 26, 2023, 1:21 p.m.

    That explains it :-)

    And that explains that, thank you. I guess they are independent in the same way that the timing of the very first event is random, a bit like resetting the trigger after each event.

    The more I think about this, the more intriguing it gets. I may start a thread on the role of time or lack thereof, shades of QM.

    Jack

  • Members 545 posts
    June 3, 2023, 2:47 a.m.

    Yes, that gives the same mean and distribution as my "Bernoulli trials". It took me a long time to see them give the same results, because I think the equation you give blows up on either a bug or the lower precision of Processing's native math functions. I switched to the Java "Math.___()" functions and used doubles instead of floats, and your equation gave almost the exact same results as the Bernoulli trials.

    The reduced-precision Processing functions were fast, but when p was greater than about 0.1 everything went awry, and when I re-wrote the code today and tried to populate a histogram array, I got an "index out of bounds" message, and adding some debugging code found that the index 2,147,483,647 was coming up, and that explained my initial problem with your equation; I was getting intervals of 2,147,483,647.

    The "Math.___()" functions with doubles are very slow, though, much slower than the Bernoulli trials for short intervals, but for long intervals, where p is very close to zero, there are a lot less calls, so it is faster.

    This is what worked in Processing:

    double p;
    int interval()
    {
      return((int)(1+Math.log(Math.random())/Math.log(1-p)));
    }
    
  • Members 545 posts
    June 3, 2023, 2:08 p.m.

    Are most agreed that the only thing wrong with the Bernoulli trials is the quantization, and the limit to one event per discrete time unit (which your offered equation based on the variate pdf also does)? Someone's "that's not Poisson" statement implied to me that they thought something was wrong (distribution-wise?) with it beyond those two issues (two issues that are easily mitigated into practical irrelevance by oversampling the trials).

    Anyway, thanks for pointing out that alternate method, which I can also use without the quantization, when it is counterproductive. It should be much faster than doing more trials to gain precision, especially with mean intervals that are large. One could make a floating point array and populate it with generated floating point intervals and do as much as possible with the extra precision and only quantize or bin as necessary.

  • Members 128 posts
    June 3, 2023, 2:31 p.m.

    Cool!

    Yes, generating lots of variates from long-tailed distributions can stress random number generation, and expose numerical problems.

    It's important to check exactly what random number generator is being used. These days it should really be at least 64-bit Mersenne Twister. There's a lot of papers out there that are likely bogus because they used whatever inadequate backwards-compatible RNG came with the programming system.

    For numerical reasons, it's important to check the exact specification of functions that return floating-point variates.

    My code was meant to be pseudocode. Illustrative, rather than taking care with numerical corner cases.

    Warning: I don't do Java.

    What I suspect was going wrong was:
    - Whatever random number generator you were using was outputting zero quite frequently.
    - Java Log returns -Inf for a zero argument
    - Log(1-p) evaluates to a finite negative value.
    - IEEE floating-point arithmetic evaluates (-Inf) / (finite negative number) to +Inf
    - Java narrowing primitive conversions evaluate (int) (+Inf) to the maximum positive representable int value, which is 2,147,483,647 = 2^31 - 1

    If you are using a Java random number generator that derives from class java.util.Random, methods nextDouble(), doubles(), nextFloat() produce variates between 0(inclusive) and 1 (exclusive).

    A numerical trick to convert this range to 0 (exclusive) to 1(inclusive) is to do:

    double not_zero = 1.0 - myRNG.nextDouble();
    

    Or:

    float not_zero = 1.0f - myRNG.nextFloat();
    

    This also works:

    float not_zero = (float) ( 1.0 - myRNG.nextDouble() );  // ( 1.0 - myRNG.nextDouble() ) cannot round to zero on conversion to float
    

    That last code should behave better than the previous code, if you are using single-precision logarithms.

    If you are using the default java.util.Random class, try using the derived ThreadLocalRandom, which is supposed to be faster and better.

    There's also Apache Commons Math, which has much better RNG support.


    Edit:

    float not_zero = (float) ( 1.0 - myRNG.nextDouble() );  // ( 1.0 - myRNG.nextDouble() ) cannot round to zero on conversion to float
    

    Is always safe.

  • Members 545 posts
    June 3, 2023, 3:15 p.m.

    There's a lot to digest there about the precisions and functions. I did block zero at one time by doing "random(0.000001,1.0)', which reduced the frequency of problems, but only of one type. I saw two problems, when I checked event rates based on event numbers and total intervals, and sometimes I got a rate that was nearly infinitesimal, and sometimes a greater rate than expected. I'd do multiple iterations and plot points in an x/y scatter, and most of the points were as expected, so no errors during some iterations, but one in every 20 or so would go wild, either close to zero, or exaggerated by up to 2x or more, the exaggeration being higher for higher values of p.

    I am glad I am past the hump, but I'm curious about all that went wrong, so I will try some slower code with checks for min and max, etc, for all variables to see where the error is coming from, by printing out all variables contributing to erroneous intervals.

    The Bernoulli trial method is certainly far more transparent, even if inefficient.

  • Members 83 posts
    June 3, 2023, 3:42 p.m.

    So far as I can remember:
    In floating point computer arithmetic, the closer one gets to zero, the further apart are the numbers than can be represented. The gap between numbers that can be represented will make two numbers that are different have the same representation. This may be avoided by using an interval further from zero and then scaling it back later. en.wikipedia.org/wiki/IEEE_754-1985#Denormalized_numbers

  • Members 128 posts
    June 3, 2023, 7:42 p.m.

    Only for denormal numbers, which are a small part of the range.

    Denormal numbers are a red herring. No need to worry about them here.

    We have a random number generator that spits out (double precision) numbers between 0 (inclusive) and 1 (exclusive).
    We get a random number called r.
    We evaluate (float)( 1.0 - r ).

    The greatest possible value of r is the (double) value preceding 1.

    The binary representation of this double value looks like:

    + 011 1111 1110 (1.) 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111
    [The brackets contain the leading '1' bit, which is not stored]

    Number preceding 1 is (2^-1 [exponent with excess 1023 removed] ) * ( 2^52 [hidden leading digit] + (2^52-1) [stored mantissa bits] ) / 2^52

    = 2^-53 * (2^53-1) = 1 - 2^-53

    So if we evaluate: 1.0 - (double preceding 1)

    We get 2^-53.

    Which is easily representable as a normalised single-precision float.


    Generating double-precision floating-point numbers between 0 (inclusive) and 1 (exclusive):

    Assuming that we have a core random number generator which can conveniently generate strings of 53 random bits, the implementation will very likely treat those 53 bits as a long integer, convert the long integer to double, and then divide by (double) 2^53.

    There other things one could try, but they may have subtle problems with non-uniformity.

    If the base random number generator has quite a small state - say 32 bits - we take the 32-bit state, treat it as an unsigned 32-bit number , convert to double, and then divide by (double) 2^32.


    If you want a production-quality generator of geometrically distributed variables, the way to go may not be via an interface that passes double-precision values.

  • Members 545 posts
    June 5, 2023, 3:35 p.m.

    OK, I wrote simpler code from scratch and this time "random(0.000001,1.0)" did not produce an error with Processing's fast floating point math and functions, so when it didn't work completely before, I was probably doing something else, like perhaps adding 0.5 instead of 1.0 before the integer cast, which would output some integer zeros, which makes sense, because I usually got those extra errors (means that suggested a much higher p value than what I input) when the distribution was over a smaller range, with a higher input for p. Adding 0.5 before going integer is normally ideal, but not when you don't want zeros, so in order to have no zeros, you must accept the intervals being slightly larger than they should be. With a p of 0.001 (where the average interval is 1000 units), that should be insignificant, but when p is 0.5, then the intervals are 0.5 units too large on average when most intervals are less than 3 units. Ideally, we'd work with the floating point intervals, but they are more work to visualize with discrete graphical elements.

  • Members 78 posts
    June 6, 2023, 7:27 a.m.

    I wouldn't worry about zero waits in this context as long as the random number generator does its job competently and it is not reset in between calls. After all time is quantized by definition in your approach, so one could argue that when a zero is generated it is simply an artifact of time not being continuous/granular enough - the probability of getting a collision becomes infinitesimally small with an infinitesimally small time period. This is also an unspoken assumption in my approach and the earlier simulation: that Matlab's Mersenne twister, 64-bit double precision random number generator represents continuous time.

    In the end imho the litmus test for the script is the resulting distribution. After 20,000 or so arrivals are you getting the expected mean and standard deviation within a fraction of one percent - and the relative Poisson mass function? To see the latter clearly I would suggest histograms with bins that result in a small mean, say around 5 per period. If you choose 5.2 you have my distribution pre-computed for comparison in the link just above.

    Jack

  • Members 78 posts
    June 6, 2023, 8:54 a.m.

    That was probably me, agreeing with you that your approach is valid even though technically Binomial rather than straight Poisson. This was my thinking: a number (n) of Bernoulli trials with probability of success (p), otherwise (1-p), is a Bernoulli process, which results in a Binomial distribution; a Binomial distribution as the number of trials approaches infinity and np approaches a finite limit tends to Poisson, with the mean rate equal to np. So according to Wikipedia a rule of thumb says that if n >100 and np<10 you are good to go.

    Jack

  • Members 128 posts
    June 6, 2023, 9:53 a.m.

    John Sheehy's approach does generate independent binomial approximations to Poisson.

    Jack's code doesn't generate independent, identically distributed, variates:

    If we choose a particular slot - say slot 0 - before looking at any of the other slots, the distribution of the histogram count for that slot is B(104000,1/20000) ~ Poisson(5.2) , which, I guess, is what you are trying to achieve.

    If we get a count of m in slot 0, that leaves 104000-m counts to put in the other slots.

    So if we now choose a second slot - say slot 1 - the distribution of the histogram count for that slot is B(104000-m,1/19999)

    After we've looked at 19999 slots, we don't need to look at the 20000th slot to see what the count is there - we can just add up the counts for the 19999 slots that we've already looked at, and take the sum away from 104000.

    By fixing the total number of raindrops, the number of raindrops in each bucket becomes dependent on the number of raindrops in the other buckets.

  • Members 78 posts
    June 6, 2023, 12:38 p.m.

    Well, this is where all those degrees with 'mathematics' in them come in handy, so here are more questions. The first Matlab line above generates 104k 64-bit floating point random values between zero and one (then scaled to match the desired experimental range). The interval between zero and one can represent the normalized continuous-time duration of the experiment for a single detector; or the normalized continuous-width of the sensor during a single exposure. Therefore for our purposes the generated random values are specific points in continuous time or space. Aren't these 104k values considered independent, identically distributed variates, assuming a well behaved random number generator?

    The second line above simply divides the time/space into a discrete number of contiguous intervals and counts the number of events that happen to fall within each.

    Jack

  • Members 78 posts
    June 6, 2023, 12:52 p.m.

    Fixing the total number of arrivals is indeed a limitation of my approach (shades of the earlier Expected Value vs Mean discussion), which however I believe becomes irrelevant with a large enough number of intervals (the dual of JohnS's n and np rule of thumb). What if I told you that I started with a known constant rate of 5.200 arrivals and that I wanted to check 20k such intervals (say a uniformly illuminated 200x100 pixel patch captured in the raw data)?