Sunday, June 16, 2019

Graphing frequencies versus distance

If we take the last digits of the first 10k primes, we see that pairs don't occur with the same frequencies as they do with random 1,3,7,9s. For random digits the distance from any digit to the next occurrence of any given digit also averages about four, but for prime last digits the distances are all different.




Wednesday, June 12, 2019

Frequencies versus distances.

We might expect a strong relationship between, e.g., the average distance from 1 to 7 (in that order) and the number of times the pair 1,7 (in that order) occurs in the last-digit tables.

look for pair frequency
-- 351 795 860 380
-- 508 307 716 870
-- 629 696 311 775
-- 899 603 523 365
NR =  9592
Sums are  2386 2401 2411 2390
Proportions for 1,x are: 0.147108 0.333194 0.360436 0.159262
Proportions for 3,x are: 0.211579 0.127863 0.298209 0.362349
Proportions for 7,x are: 0.260888 0.288677 0.128992 0.321443
Proportions for 9,x are: 0.376151 0.252301 0.218828 0.15272
<<< Process finished (PID=18032). (Exit code 0)
================ READY ================

We could start by looking at tables above.

Distances for pairs from random generation of 1,3,7,9.


I wanted to confirm nice, symmetric frequencies for random 1,3,7,9 digits. So I first created a file of 10k random digits 0,1,2,3, then changed all the 0s to 7s and all the 2s to 9s to give a file called randoms10k1379A.txt. {
https://drive.google.com/open?id=1kAbHBenOM6mSXECkgvOiqI9Ftfthrwny
https://pastebin.com/akpWhRdE
}. Then I made a TinyC program, primDist4E.c { https://pastebin.com/archive/text }, to report on the distances between pairs of 1,3,7,9 digits in the list. It outputs, for instance, all the 7,x distances to a file called distFile7.txt. Here's an example of that file using just 100 prime digits.

1 2 3 4 To be read in conjunction with primdist4.c
4 2 5 1
1 2 3 5
3 1 4 2
1 2 6 3
1 2 3 4
4 2 3 1
1 7 2 3
3 5 4 1
5 1 2 4
3 1 6 2
2 3 4 1
4 2 6 1
3 1 4 2
1 2 4 11
1 2 3 7
1 5 2 4
11 3 1 2
10 2 4 1
6 1 5 2
1 6 10 2
1 2 3 4
3 5 2 1
1 3 0 2
The first line goes 1 2 3 4, which means the first 7 is distance 1 away from the next 1, 2 away from the next 3, 3 away from the next 7 and 4 away from the next 9. The symmetry is just a coincidence.
Note that lines 18 and 19 indicate that the 7 is a long way from the first 1 after it.
There should be no zeros except in the last lines, where the list runs out before 1,3,7,9 have all been found after the anchor 7. What about the sums of the rows? The smallest must be like line 1, viz 1+2+3+4 = 10, which corresponds to consecutive prime last digits of 7 1 3 7 9 (i.e. ..07 11 13 17 19 ...) in the prime number list.

Gawk file below is doStatsOnDistFiles0.gawk  
https://pastebin.com/uFiABNwV

NPP_SAVE: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0\doStatsOnDistFiles0.gawk
CD: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0
Current directory: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0
INPUTBOX: "enter file eg distFile1.txt"
local $(INPUT) = distFile1.txt
local $(INPUT[1]) = distFile1.txt
C:\Users\peterb\Desktop\NewSoftware\GnuWin\GetGnuWin32\gnuwin32\bin\gawk -f C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0\doStatsOnDistFiles0.gawk distFile1.txt
Process started (PID=10148) >>>
Here comes the stats from doStatsOnDistFiles0.gawk
11093 10652 10775 11115
Averages below:
Number of records is 2732
4.0604  is mean of column 1
3.89898  is mean of column 2
3.944  is mean of column 3
4.06845  is mean of column 4
<<< Process finished (PID=10148). (Exit code 0)
================ READY ================

NPP_SAVE: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0\doStatsOnDistFiles0.gawk
CD: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0
Current directory: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0
INPUTBOX: "enter file eg distFile1.txt"
local $(INPUT) = distFile3.txt
local $(INPUT[1]) = distFile3.txt
C:\Users\peterb\Desktop\NewSoftware\GnuWin\GetGnuWin32\gnuwin32\bin\gawk -f C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0\doStatsOnDistFiles0.gawk distFile3.txt
Process started (PID=6708) >>>
Here comes the stats from doStatsOnDistFiles0.gawk
11344 11090 11253 11104
Averages below:
Number of records is 2806
4.04277  is mean of column 1
3.95225  is mean of column 2
4.01033  is mean of column 3
3.95723  is mean of column 4
<<< Process finished (PID=6708). (Exit code 0)
================ READY ================
NPP_SAVE: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0\doStatsOnDistFiles0.gawk
CD: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0
Current directory: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0
INPUTBOX: "enter file eg distFile1.txt"
local $(INPUT) = distFile7.txt
local $(INPUT[1]) = distFile7.txt
C:\Users\peterb\Desktop\NewSoftware\GnuWin\GetGnuWin32\gnuwin32\bin\gawk -f C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0\doStatsOnDistFiles0.gawk distFile7.txt
Process started (PID=8132) >>>
Here comes the stats from doStatsOnDistFiles0.gawk
11943 10945 11083 11247
Averages below:
Number of records is 2798
4.26841  is mean of column 1
3.91172  is mean of column 2
3.96104  is mean of column 3
4.01966  is mean of column 4
<<< Process finished (PID=8132). (Exit code 0)
================ READY ================
NPP_SAVE: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0\doStatsOnDistFiles0.gawk
CD: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0
Current directory: C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0
INPUTBOX: "enter file eg distFile1.txt"
local $(INPUT) = distFile9.txt
local $(INPUT[1]) = distFile9.txt
C:\Users\peterb\Desktop\NewSoftware\GnuWin\GetGnuWin32\gnuwin32\bin\gawk -f C:\Users\peterb\Desktop\NewSoftware\TinyC\tcc-0.9.27-win32-bin\tcc\Primes0\doStatsOnDistFiles0.gawk distFile9.txt
Process started (PID=7832) >>>
Here comes the stats from doStatsOnDistFiles0.gawk
11161 10837 11074 11094
Averages below:
Number of records is 2757
4.04824  is mean of column 1
3.93072  is mean of column 2
4.01668  is mean of column 3
4.02394  is mean of column 4
<<< Process finished (PID=7832). (Exit code 0)
================ READY ================

The entry highlighted in red in the original post, 4.01668, is the mean of column 3 when doStatsOnDistFiles0.gawk works on distFile9.txt, i.e. the average distance from a 9 to the next 7.

Note that all the mean distances above, for 1,1 and 1,3 and 1,7 ... and 9,9, are close to 4.0. Not sure why. I expect symmetry but can't calculate the expectation.



Prime Distances 2

I took the first 10K (more or less) primes and made a file of the last digits. {
 https://drive.google.com/open?id=1iVTHNs9oeX5LPYQr0VgiaUwUOoXUuHBb
(lastdigit0.txt in google drive) }

Then I looked at consecutive pairs of digits and counted their frequencies, using the gawk script pairs1.txt below.

Summary of pair frequencies for the last digits of the first 10K (more or less) primes:

#file below is gawk program called pairs1.txt
BEGIN {
    print "look for pair frequency"
    previous = 1    # arbitrary starting value; the first record is paired with it
}
{
    # one last digit per line: count the pair (previous digit, current digit)
    current = $1
    ar[previous,current]++
    previous = current
}
END {
    # rows are the first digit of the pair, columns the second, in the order 1,3,7,9
    print "--", ar[1,1],ar[1,3],ar[1,7],ar[1,9];
    print "--", ar[3,1],ar[3,3],ar[3,7],ar[3,9];
    print "--", ar[7,1],ar[7,3],ar[7,7],ar[7,9];
    print "--", ar[9,1],ar[9,3],ar[9,7],ar[9,9];
    print "NR = ", NR
    # row sums: how many pairs start with 1, 3, 7 and 9
    s[1] = ar[1,1]+ar[1,3]+ar[1,7]+ar[1,9];
    s[3] = ar[3,1]+ar[3,3]+ar[3,7]+ar[3,9];
    s[7] = ar[7,1]+ar[7,3]+ar[7,7]+ar[7,9];
    s[9] = ar[9,1]+ar[9,3]+ar[9,7]+ar[9,9];
    print "Sums are ", s[1],s[3],s[7],s[9];
    # each count as a proportion of its row sum
    print "Proportions for 1,x are: " ar[1,1]/s[1],ar[1,3]/s[1], ar[1,7]/s[1],ar[1,9]/s[1]
    print "Proportions for 3,x are: " ar[3,1]/s[3],ar[3,3]/s[3], ar[3,7]/s[3],ar[3,9]/s[3]
    print "Proportions for 7,x are: " ar[7,1]/s[7],ar[7,3]/s[7], ar[7,7]/s[7],ar[7,9]/s[7]
    print "Proportions for 9,x are: " ar[9,1]/s[9],ar[9,3]/s[9], ar[9,7]/s[9],ar[9,9]/s[9]
}
---------------output--------------
NPP_SAVE: C:\Users\Dell\Documents\Primes\Distances0\gawkStuff\pairs1.txt
CD: C:\Users\Dell\Documents\Primes\Distances0\gawkStuff
Current directory: C:\Users\Dell\Documents\Primes\Distances0\gawkStuff
INPUTBOX: "Script arguments : "
local $(INPUT) = C:\Users\Dell\Documents\Primes\PrimeLists\lastdigit0.txt
local $(INPUT[1]) = C:\Users\Dell\Documents\Primes\PrimeLists\lastdigit0.txt
Script input arguments, @ARGV : C:\Users\Dell\Documents\Primes\PrimeLists\lastdigit0.txt
"C:\Users\Dell\Desktop\Setup Files\GnuWin\GetGnuWin32\gnuwin32\bin\gawk.exe"    -f  C:\Users\Dell\Documents\Primes\Distances0\gawkStuff\pairs1.txt C:\Users\Dell\Documents\Primes\PrimeLists\lastdigit0.txt
Process started (PID=18032) >>>
look for pair frequency
-- 351 795 860 380
-- 508 307 716 870
-- 629 696 311 775
-- 899 603 523 365
NR =  9592
Sums are  2386 2401 2411 2390
Proportions for 1,x are: 0.147108 0.333194 0.360436 0.159262
Proportions for 3,x are: 0.211579 0.127863 0.298209 0.362349
Proportions for 7,x are: 0.260888 0.288677 0.128992 0.321443
Proportions for 9,x are: 0.376151 0.252301 0.218828 0.15272
<<< Process finished (PID=18032). (Exit code 0)
================ READY ================
The number highlighted in red in the original post, 0.298209 in the 3,x row, tells us that the 3,7 pair makes up 29.8209% of all pairs of the form 3,x. For random digits we would expect these numbers to all be 0.25 plus or minus a little bit.
Note that the lowest number in each row is 1,1 or 3,3 or 7,7 or 9,9. Conclusion: there's a low probability of repetition.

Saturday, April 27, 2019

Prime distances 1

To do: Get averages (means) from 1 to 1, 1 to 3 .... 9 to 9 by looking at the entries in distFileN.txt.
How to read a big list of last digits of prime numbers into the array ar[] for use below.
How to get modes, medians, standard deviations etc.
How to read distFileN.txt into Excel to get stats and graphs.


This project is stored in online gdb here. It creates four new text files, distFile1.txt, distFile3.txt, distFile7.txt and distFile9.txt, that record distances from, e.g., 1 to 1, 1 to 3, 1 to 7 and 1 to 9. I suspect some distances will differ from what we would see if we were just dealing with random numbers.

//PrimeDistances7a Looks at distance from each 1,3,7,9 (last digits of primes in list) to each of the 4 possibilities.
// eg given int ar[] = { 1, 3, 7, 9, 1, 1, 3, 3, 7, 7, 9, 9 } and looking at first 1 we see that the four distances are
// 4,1,2,3 from 1,3,7,9 respectively. Want to work out averages for each pair 1 to 1, 1 to 3 ....9 to 9. Should be more interesting than
// just random 1,3,7,9s...

#include <stdio.h>
int ar[] = { 1, 3, 7, 9, 1, 1, 3, 3, 7, 7, 9, 9 };

int distances[10];
void showDistances(void);
void match1(int anchor);
void writeDistances(int anch);  // declared here because match1() calls it before its definition
void houseKeeping(void);
void windUp(void);
int size;
int newDistCounter = 0;
FILE *fptr1,*f1,*f3,*f7,*f9;

int main () {

  houseKeeping();
  size = sizeof (ar) / sizeof (ar[0]);

  for (int i =0;i<size-3;i++) {
      match1(i);
      showDistances();
  }
  windUp();   // close the four output files before exiting
  return 0;
}


void match1(int anchor) {
    for(int l =0;l<10;l++) distances[l]=0;
    int i = anchor;
    int j = 1;
    newDistCounter=0;
    while ((i + j < size) && (newDistCounter<4) ){
        if (distances[ar[i + j]] == 0) {
            distances[ar[i + j]] = j;
            newDistCounter++;
        }
        j++;
    }
    writeDistances(ar[anchor]);
}

void showDistances(void) {
  for (int k = 0; k < 10; k++) {
      printf ("%d ", distances[k]); // print the distance recorded for each digit 0..9 (0 = not found)
  }
  printf(".....\n");
}
 
void writeDistances(int anch) {
    switch (anch) {
        case 1:
            fprintf(f1,"%d %d %d %d\n",distances[1],distances[3],distances[7],distances[9]);
            break;
        case 3:
            fprintf(f3,"%d %d %d %d\n",distances[1],distances[3],distances[7],distances[9]);
            break;
        case 7:
            fprintf(f7,"%d %d %d %d\n",distances[1],distances[3],distances[7],distances[9]);
            break;
        case 9:
            fprintf(f9,"%d %d %d %d\n",distances[1],distances[3],distances[7],distances[9]);
            break;
        default:
            printf("Not a prime\n");
    }
}

void houseKeeping(void) {
    int status;
    status = remove("distFile1.txt");
    f1 = fopen("distFile1.txt","a");
    status = remove("distFile3.txt");
    f3 = fopen("distFile3.txt","a");
    status = remove("distFile7.txt");
    f7 = fopen("distFile7.txt","a");
    status = remove("distFile9.txt");
    f9 = fopen("distFile9.txt","a");
}
void windUp(void) {
    fclose(f1);
    fclose(f3);
    fclose(f7);
    fclose(f9);
}
....................................
Here's what distFile1.txt will look like:
4 1 2 3
1 2 4 6
0 1 3 5

ie, top line above 4 1 2 3:
distance from 1 to 1 is 4 steps  (look at int ar[] = { 1, 3, 7, 9, 1, 1, 3, 3, 7, 7, 9, 9 };
distance from 1 to 3 is 1 step,  look at int ar[] = { 1, 3, 7, 9, 1, 1, 3, 3, 7, 7, 9, 9 };
distance from 1 to 7 is 2 steps, look at int ar[] = { 1, 3, 7, 9, 1, 1, 3, 3, 7, 7, 9, 9 };
distance from 1 to 9 is 3 steps, look at int ar[] = { 1, 3, 7, 9, 1, 1, 3, 3, 7, 7, 9, 9 };


Thursday, January 31, 2019

Prime dots

I wanted to show that the last digits of prime numbers do not occur randomly; rather, the 1,3,7,9 that they all end with occur in patterns. So I found some random number lists and played with them using gawk and C until I had a nice list of the last digits of all the primes up to one million. There are about 79,000 of them. This is the file that's opened in the following line:
ptr_file =fopen("upTo1GlastNum3.txt","r");

I gave each of 1,3,7,9 a dot-colour and plotted the roughly 79k dots in a 256-colour BMP using a great library I found called EasyBMP.

For comparison, I did the same for a 640 x 480 BMP of random 1,3,7,9 digits generated by the normal random number function in C.

Here is the picture that's generated from actual prime numbers:


And here's the one generated by rand() in C.


There's more in the lower picture because I kept going until the full 640 x 480 BMP was filled. That is 307K dots, whereas the dots generated from real primes came from a list of only 78k digits.
Upshot: they look very similar. Not sure this BMP method is a good way to show the non-randomness of prime last digits.

The program I used follows.

//BMPDots7 makes a BMP of the last digit of all primes up to 1M. Plus does similar for the last digit of program-generated random numbers.
// Gives two BMPs. One for last digit of primes, other for random 1,3,7,9. Hard to tell the difference.
//To do: Create file of 1,3,7,9 from rand(). Read file in and place relevant dot on BMP.

#include "EasyBMP.h"
#include <cstdio>    // fopen, fgets, printf
#include <cstdlib>   // rand
#include <iostream>  // cout
using namespace std;
int main( int argc, char* argv[] )
{
    
BMP PrimePic0;
PrimePic0.SetSize(640,480);
PrimePic0.SetBitDepth(8);

int xxcount3 = 0; 
int ones2=0;
int threes2=0; int sevens2=0; int nines2=0; long nonPrimes2 = 0; long allLines=0;
FILE *ptr_file; 
char buf[1000];
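ptr_file =fopen("upTo1GlastNum3.txt","r");
if (ptr_file == NULL) { // stop early if the digit list is missing
cout << "Cannot open upTo1GlastNum3.txt" << endl;
return 1;
}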
int h=0; int k=0;
while ((fgets(buf,1000, ptr_file)!=NULL) && (k<480)){
allLines++;
switch (buf[0]) {

case '1':
PrimePic0(h,k)->Red = 255; 
PrimePic0(h,k)->Green = 255;
PrimePic0(h,k)->Blue = 0;
ones2++;
break;
case '3':
PrimePic0(h,k)->Red = 255; 
PrimePic0(h,k)->Green = 0;
PrimePic0(h,k)->Blue = 0;
threes2++;
break;
case '7':
PrimePic0(h,k)->Red = 0; 
PrimePic0(h,k)->Green = 255;
PrimePic0(h,k)->Blue = 0;
sevens2++;
break;
case '9':
PrimePic0(h,k)->Red = 0; 
PrimePic0(h,k)->Green = 0;
PrimePic0(h,k)->Blue = 255;
nines2++;
break;
default:
cout << "Not prime" << allLines <<endl;
nonPrimes2++;
printf("Non prime");
}
h++;
if (h==640) {
h=0;
k++;
}
}
fclose(ptr_file);
PrimePic0.WriteToFile("prime0Pic.bmp");
//Now do case where 1,3,7,9 are generated by random numbers via c-rand().
BMP AnImage;
AnImage.SetSize(640,480);
AnImage.SetBitDepth(8);
cout << "File info:" << endl; 
cout << AnImage.TellWidth() << " x " << AnImage.TellHeight() << " at " << AnImage.TellBitDepth() << " bpp" << endl;
cout << "colors: " << AnImage.TellNumberOfColors() << endl;
cout << "(" << (int) AnImage(14,18)->Red << "," << (int) AnImage(14,18)->Green << "," << (int) AnImage(14,18)->Blue << "," << (int) AnImage(14,18)->Alpha << ")" << endl; 


int randDigit,i,j,ones,threes,sevens,nines;
long nonPrimes=0;
long allNums = 0;
j=0; ones=threes=sevens=nines=0; i=0;

while (j<480) 
{
randDigit = rand()%10;
allNums++;
switch (randDigit)
{
case 1:
AnImage(i,j)->Red = 255; 
AnImage(i,j)->Green = 255;
AnImage(i,j)->Blue = 0;
ones++;
break;
case 3:
AnImage(i,j)->Red = 255; 
AnImage(i,j)->Green = 0;
AnImage(i,j)->Blue = 0;
threes++;
break;
case 7:
AnImage(i,j)->Red = 0; 
AnImage(i,j)->Green = 255;
AnImage(i,j)->Blue = 0;
sevens++;
break;
case 9:
AnImage(i,j)->Red = 0; 
AnImage(i,j)->Green = 0;
AnImage(i,j)->Blue = 255;
nines++;
break; 
default:
nonPrimes++;
i--;
}
i++;
if (i==640) {
i=0;
j++;
}
}
AnImage.WriteToFile("copied1dots3.bmp");
cout << "Non primes= " << nonPrimes << endl;
cout << "Ones =" << ones;
cout << "  Threes =" << threes;
cout << "  Sevens =" << sevens;
cout << "  Nines =" << nines << endl;
cout << "  xxcount3 " << xxcount3 << endl;
cout << "  allNums " << allNums << endl;
cout << "  ones2 " << ones2 << endl;
cout << "  Threes2 =" << threes2;
cout << "  Sevens2 =" << sevens2;
cout << "  Nines2 =" << nines2 << endl;
cout << "Non primes2 = " << nonPrimes2 << endl;
cout << "allLines = " << allLines << endl;
cout << "Sum of primes = " << ones2+threes2+sevens2+nines2 << endl;
 return 0;
}