minor edit, and adding to the end

going deeper on shuffles
started analysis of shuffles
2025-07-23 21:03:37 +00:00 · 2025-07-23 20:33:57 +00:00 · 2025-07-21 22:51:07 +00:00
60 changed files with 519 additions and 4238 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -2,5 +2,3 @@
 .DS_Store
 output
 __pycache__
-.ipynb_checkpoints
-
--- a/init.py
+++ b/init.py
--- a/docs/nkode_analysis.md
+++ b/docs/nkode_analysis.md
@@ -1,20 +1,19 @@
-# nKode analysis
+ # nKode analysis

 Luke Oeding (Auburn University)

 ## What is an nKode?
-
 An nKode consists of a passcode $P$ selected utilizing the following setup:

- a $k$ digit keypad, where each key has:
- $p$ properties (position, central number, color, letter, emoji,...)
- $m$ options for each property ($\#$ positions, $\#$ central numbers, $\#$ colors, $\#$ letters, $\#$ emoji, ... )
+* a $k$ digit keypad, where each key has:
+* $p$ properties (position, central number, color, letter, emoji,...)
+* $m$ options for each property (\# positions, \# central numbers, \# colors, \# letters, \# emoji, ... )

    - In principle the number of options for each property doesn't have to be the same, but for the sake of our analysis, we make this uniform choice.
    - Typically one wants to have all letters displayed exactly once on the keypad for any instance of a keypad, so we take $m = k$.

- $\ell$ = length of the passcode, which is a sequence of $\ell$ letters (options) selected from any of the options.
- a shuffling rule (the split-shuffle)
+* $\ell$ = length of the passcode, which is a sequence of $\ell$ letters (options) selected from any of the options. 
+* a shuffling rule (the split-shuffle)

 ## Split shuffling

@@ -47,14 +46,14 @@ $$ E(m,p,\ell) = \log_2[(mp)^\ell] = \ell \frac{\log(mp)}{\log(2)}$$

 So, for example, when $m=10$ and $p = 7$, the alphabet has 70 letters, and a passcode of length $\ell = 4$ would have entropy:

-$$E(10,7,4) = 4\log(70)/\log(2) = 24.5\; \text{bits}$$
+$$ E(10,7,4) =  4\log(70)/\log(2) = 24.5\; \text{bits}$$

-$$E(10,7,6) = 6\log(70)/\log(2) = 36.8\; \text{bits}$$
+$$ E(10,7,6) =  6\log(70)/\log(2) = 36.8\; \text{bits}$$

-Relevant is the entropy per symbol, which is $E(m,p,1) = \log_2(m) +\log_2(p) = \frac{\log(m) + \log(p)}{\log(2)}$.
+Relevant is the entropy per symbol, which is $ E(m,p,1)  = \log_2(m) +\log_2(p) = \frac{\log(m) + \log(p)}{\log(2)}$.

 For instance:
-$E(10,7,1) = \log(70)/\log(2) = 6.13\; \text{bits}$
+$ E(10,7,1) =  \log(70)/\log(2) = 6.13\; \text{bits}$
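
The entropy figures above can be reproduced with a few lines of Python. This is a minimal sketch of the formula $E(m,p,\ell) = \ell \log_2(mp)$; the function name `passcode_entropy_bits` is illustrative and does not appear in the repository.

```python
from math import log2


def passcode_entropy_bits(m: int, p: int, length: int) -> float:
    """Entropy in bits of a passcode of `length` letters drawn
    uniformly (with replacement) from an alphabet of m * p letters."""
    return length * log2(m * p)


if __name__ == "__main__":
    print(round(passcode_entropy_bits(10, 7, 4), 1))  # 24.5
    print(round(passcode_entropy_bits(10, 7, 6), 1))  # 36.8
    print(round(passcode_entropy_bits(10, 7, 1), 2))  # 6.13
```

The per-symbol value is the same function evaluated at length 1, so the total entropy scales linearly with the passcode length.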

 ### nKode Entropy

@@ -63,16 +62,16 @@ If one is interested in the likelihood of a single attempt randomly entered nKod
 The total number of available passcodes with
 $\ell$ letters selected from an alphabet with $k$ keys, with replacement is

-$$k^\ell$$
+$$ k^\ell $$

 So the entropy of the one time keypad is 

-$$E(k,\ell) = \log_2[(k)^\ell] = \ell \frac{\log(k)}{\log(2)}$$
+$$ E(k,\ell) = \log_2[(k)^\ell] = \ell \frac{\log(k)}{\log(2)}$$

 So, for example, when $k=10$ a passcode of length $\ell = 4$ would have entropy:
-$E(10,4) = 4\log(10)/\log(2) = 13.28\; \text{bits}$
+$ E(10,4) =  4\log(10)/\log(2) = 13.29\; \text{bits}$

-The entropy per symbol, which is $E(k,1) = \log_2(k) = \frac{\log(k)}{\log(2)}$.
+The entropy per symbol is $ E(k,1)  = \log_2(k) = \frac{\log(k)}{\log(2)}$.

 In the case $k=10$ we have $E(10,1) = 3.32 \; \text{bits}$. 

@@ -84,6 +83,7 @@ We are interested in understanding the number of times an eavesdropping intruder

 [Brooks Brown says that the lower bound is $\min\{3,  s \log_2+1\}$, where $s$ is the number of attribute sets, for an nKode of complexity $c=1$, regardless of nKode length $l$, or the number of tiles $t$.]

+
 Regarding the eavesdropper attack we should also consider the case of a keystroke recorder that doesn't observe the labels on the keys of the nKode keypad. 
 Blind single attacks and repeated blind attempts that successfully log in, and learning the nKode. 

@@ -95,11 +95,16 @@ Blind single attacks and repeated blind attempts that successfully log in, and l

 The probability that a (blind) randomly entered key-string will yield a successful login:

-$R = \frac{\#(\text{passcodes that would yield a correct login})}{\#(\text{possible passcodes})}.$
+$
+R = \frac{\text{Num}(\text{passcodes that would yield a correct login})}{\text{Num}(\text{possible passcodes})}.
+$

+(Num stands for 'number of'.)
 This should be simply computed using the number of keys $k$ and the length $l$ of the passcode:

-$R = (1/k)^l.$
+$
+R = (1/k)^l.
+$

 For example, a 6 digit pin would yield a one-in-a-million chance of blindly hitting the correct passcode. 

@@ -109,7 +114,7 @@ For the moment we only consider the case of $p=2$, which could be the case if th

 Note that if a user insists on only using the placement value of the keys in an nKode to select their password, then they effectively have reduced the complexity or entropy of their password to that of the normal keypad, and an adversary could use frequency analysis to increase the likelihood of guessing a correct password. However, if the user were to use only the number values on the keypad, they would still only have the entropy of a standard keypad generated password, however, because of the shuffling of the letters, using an nKode provides the user with some protection against frequency analysis attacks in the case of a blind intruder.

-For example, it is know that users pick passwords like 1234 or 123456 much more frequently than of any other password. If the intruder is able to observe the user typing one of these passcodes on an nKode keypad, then they would be able to guess the passcode with higher probability than random. However, if the intruder is only able to record keystrokes, but not see the display of the nKode keypad, then the intruder would only guess the correct passcode at the frequency of guessing a permutation of the correct passcode. In the case of $\ell$ consecutive digits, the keystroke intruder would only observe $\ell$ distinct keys being typed. The number of such passcodes is $\binom{k}{\ell}$. So in the case of a $k=10$ digit keypad, the number of $\ell=4$ digit passcodes with distinct entries is $\binom{10}{4} = 210$, and when $\ell = 6$ we also have $\binom{10}{6} = 210$. So, after one observation indicating that the passcode consists of $\ell$ distinct digits, the intruder would have probability $P = 1/\binom{k}{\ell}$ of guessing the passcode.
+For example, it is known that users pick passwords like 1234 or 123456 much more frequently than any other password. If the intruder is able to observe the user typing one of these passcodes on an nKode keypad, then they would be able to guess the passcode with higher probability than random. However, if the intruder is only able to record keystrokes, but not see the display of the nKode keypad, then the intruder would only guess the correct passcode at the frequency of guessing a permutation of the correct passcode. In the case of $\ell$ consecutive digits, the keystroke intruder would only observe $\ell$ distinct keys being typed. The number of such selections of $\ell$ distinct keys, ignoring order, is $\binom{k}{\ell}$. So in the case of a $k=10$ digit keypad, the number of selections of $\ell=4$ distinct keys is $\binom{10}{4} = 210$, and when $\ell = 6$ we also have $\binom{10}{6} = 210$. So, after one observation indicating that the passcode consists of $\ell$ distinct digits, the intruder would have probability $P = 1/\binom{k}{\ell}$ of guessing the passcode. 
 It is known that the expected number of trials until the first success is $1/P$. In this case it would take the intruder on average $\binom{k}{\ell}$ attempts to successfully log in (without guessing the passcode). Even in the case of the 4 digit PIN where the intruder guesses that the passcode consists of the first $4$ digits (or consecutive, or even just distinct, integers), the nKode obtains an increase in security for the user by a factor of approximately 210, since the key recording intruder would guess the password 1234 on the first attempt on a standard keypad, but would need approximately 210 trials to successfully log in. 
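
The counting in the paragraph above can be checked directly. This is a small sketch that follows the text's model (success probability $P = 1/\binom{k}{\ell}$ per attempt, expected number of trials $1/P$); the function names are illustrative assumptions, not repository code.

```python
from fractions import Fraction
from math import comb


def blind_success_probability(k: int, length: int) -> Fraction:
    """P = 1 / C(k, length): chance of a successful login per attempt
    for a keystroke-only intruder, under the model in the text."""
    return Fraction(1, comb(k, length))


def expected_trials(success_probability: Fraction) -> Fraction:
    """Mean of a geometric distribution: 1 / P attempts until first success."""
    return 1 / success_probability


if __name__ == "__main__":
    for length in (4, 6):
        prob = blind_success_probability(10, length)
        print(length, prob, expected_trials(prob))
```

Using `Fraction` keeps the probabilities exact, which makes the symmetry $\binom{10}{4} = \binom{10}{6}$ visible in the output.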

 ### Multiple blind attempts for a large userbase (Password spraying)
@@ -138,9 +143,13 @@ For example,
 ### Guessing the passcode:

 The likelihood that a randomly chosen passcode will yield a successful login no matter what shuffle has been applied, i.e. so that the attacker can successfully log in as many times as they want:
-$1 / \#(\text{possible passcodes}).$
+$1 / \text{Num}(\text{possible passcodes}).$

-For example, when $k=m=10, p=7$ for $\ell = 4$ this probability is $(70)^{-4} \sim 4.16*10^{-8}$, or about 4 chances in 100 million, and for $\ell = 6$ this probability is $(70)^{-6} \sim 8.5*10^{-12}$, or about 8.5 chances in 1 trillion.
+For example, when $k=m=10, p=7$ for $\ell = 4$ this probability is 
+
+$(70)^{-4} \sim 4.16*10^{-8},$
+
+ or about 4 chances in 100 million, and for $\ell = 6$ this probability is $(70)^{-6} \sim 8.5*10^{-12}$, or about 8.5 chances in 1 trillion.  
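
As a quick numerical check of these two probabilities, here is a minimal sketch; `guess_probability` is a hypothetical helper name introduced only for this illustration.

```python
def guess_probability(m: int, p: int, length: int) -> float:
    """Chance that one uniformly random passcode equals the true one,
    with m * p letters in the alphabet and `length` letters per passcode."""
    return 1 / (m * p) ** length


if __name__ == "__main__":
    print(f"{guess_probability(10, 7, 4):.2e}")  # 4.16e-08
    print(f"{guess_probability(10, 7, 6):.1e}")  # 8.5e-12
```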

 ## Multiple Blind Attempts

@@ -155,7 +164,6 @@ For example, when $k=10$ for $\ell = 4$ and $s=2$ this probability is $((10)^{-4
 For example, when $k=10$ for $\ell = 4$ and $s=3$ this probability is $((10)^{-4})^3 = 10^{-12}$, [same order of magnitude as a passcode of length 6], and for $\ell = 6$ this probability is $((10)^{-6})^3 = 10^{-18}$, or 1 chance in one billion billion. 

 ## Incorrect nKodes that still work
-
 We should also consider the number of nKodes that would yield a successful sequence of $s$ logins. [Add example]

 ## Longer passcodes might not always be more secure
@@ -185,3 +193,348 @@ If someone is able to observe the user typing the nKode, or record the keystroke
 ### Eye tracking

 Eye tracker on phone: How good would this need to be in order to see what attribute the user is searching for?
+
+# Higher Complexity
+
+## Dispersion
+Dispersion is an operation on a keypad that permutes properties in such a way that 2 observations of the nKode are sufficient to learn the passcode. It does this by applying a distinct rotation to each property. The authors note that this is possible when the number of keys is not larger than the number of properties per key, because this ensures that there are enough distinct rotations so that no repetitions occur.  There are more general permutations that can also have this property, and it seems that this is already implemented in the Enrollment_Login_Renewal. 
+
+## Split shuffle
+Split shuffle attempts to avoid the dispersion permutation of the keypad so as to increase the number of times an intruder would have to observe the nKode being entered.
+
+The properties are divided into 2 sets, each set will be shuffled by the same shuffle applied to all properties in that set of properties.
+
+Note: by observing both keypads (before and after a split-shuffle), one can learn both what the split was, and what the two shuffles were.
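
To make the description concrete, here is a minimal sketch of one split shuffle. It assumes the keypad is stored as a list of keys, each key holding one label per property, and that each of the two property sets gets its own random permutation of the keys. The names `make_keypad` and `split_shuffle` are illustrative; the repository's `Keypad.split_shuffle` may work differently.

```python
import random


def make_keypad(k: int, p: int) -> list[list[str]]:
    """keypad[key][prop] is the label of property `prop` shown on `key`,
    e.g. 'a0' is property a (index 0) in its home position on key 0."""
    return [[f"{chr(ord('a') + prop)}{key}" for prop in range(p)]
            for key in range(k)]


def split_shuffle(keypad, rng):
    """Split the properties into two halves and move each half
    across the keys with its own permutation (assumed model)."""
    k, p = len(keypad), len(keypad[0])
    props = list(range(p))
    rng.shuffle(props)
    set_a, set_b = sorted(props[: p // 2]), sorted(props[p // 2:])
    new_keypad = [row[:] for row in keypad]
    for prop_set in (set_a, set_b):
        perm = rng.sample(range(k), k)  # one permutation per property set
        for new_key, old_key in enumerate(perm):
            for prop in prop_set:
                new_keypad[new_key][prop] = keypad[old_key][prop]
    return new_keypad, set_a, set_b


if __name__ == "__main__":
    shuffled, set_a, set_b = split_shuffle(make_keypad(4, 6), random.Random(0))
    print(set_a, set_b)
    for row in shuffled:
        print(row)
```

Because all properties in one set move together, comparing the keypads before and after reveals both the split and the two permutations, which is the point of the note above.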
+
+We're interested in studying how many observations an intruder must make in order to learn the passcode with the split shuffle in place. There are a few scenarios I can imagine. 
+
+    * No split (analyzed above).
+    * The split is determined once, and then the shuffles only happen on one side of the split.
+    * The split changes every time randomly.
+    * The split changes every time by a set strategy.
+
+Of course we can consider these questions each time the metaparameters change. Recall,  
+    * a $k$ digit keypad,
+    * $p$ properties (position, central number, color, letter, emoji,...)
+    * $m$ options for each property (\# positions, \# central numbers, \# colors, \# letters, \# emoji, ... ). Typically $m = k$.
+    * $\ell$ = length of the passcode, which is a sequence of $\ell$ letters (options) selected from any of the options. 
+
+Notice that when $k = 1$ the problem is nearly trivial. It doesn't matter what the properties are, the only thing the user is entering is $\ell$, and that can be observed in 1 try.
+
+When $k = 2$, here's the case $p = 2$ and a 4 letter passcode.
+
+      
+|key   | p0 | p1 |
+|:----:|:--:|:--:|
+|key 0 | a0 | b0 |
+|key 1 | a1 | b1 |
+
+After a shuffle [attribute 1]
+
+|key   | p0 | p1 |
+|:----:|:--:|:--:|
+|key 0 | a0 | b1 |
+|key 1 | a1 | b0 |
+
+interaction:
+
+|what | | | | |
+|--------|--|--|--|--|
+|Passcode| a0|b1|a1|b0|
+|Display 1| 0|1|1|0|
+|Display 2| 0|0|1|1|
+
+The possible passcodes after display 1: 
+0{a0,b0},1{a1,b1},1{a1,b1},0{a0,b0}
+
+The possible passcodes after display 2: 
+0{a0,b1},0{a0,b1},1{a1,b0},1{a1,b0}
+
+Intersect:
+{a0},{b1},{a1},{b0}
+Passcode learned in 2.
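
The intersection step in this example can be replayed in code. This is a minimal sketch in the spirit of the intersection loop in `src/evilkode.py`; the helper names and the representation of a display as a list of label sets are assumptions made for illustration.

```python
def key_of(display, letter):
    """Index of the key that shows `letter` on this display."""
    for key, letters in enumerate(display):
        if letter in letters:
            return key
    raise ValueError(f"{letter} is not on the keypad")


def intersect_observations(displays, passcode):
    """For each passcode position, intersect the label sets of the
    keys the user pressed across all observed displays."""
    candidates = None
    for display in displays:
        pressed = [key_of(display, letter) for letter in passcode]
        seen = [set(display[key]) for key in pressed]
        if candidates is None:
            candidates = seen
        else:
            candidates = [c & s for c, s in zip(candidates, seen)]
    return candidates


display_1 = [{"a0", "b0"}, {"a1", "b1"}]
display_2 = [{"a0", "b1"}, {"a1", "b0"}]  # attribute 1 shuffled
passcode = ["a0", "b1", "a1", "b0"]

result = intersect_observations([display_1, display_2], passcode)
print(result)  # one singleton set per position: a0, b1, a1, b0
```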
+
+
+When $k = 2$. Here's the case $p = 4$ and a 4 letter passcode.
+
+
+|key   | p0 | p1 | p2 | p3 |
+|:----:|:--:|:--:|:--:|:--:|
+|key 0 | a0  | b0  | c0  | d0  |
+|key 1 | a1  | b1  | c1  | d1  |
+
+After a shuffle [attribute 1,2]
+
+|key   | p0 | p1 | p2 | p3 |
+|:----:|:--:|:--:|:--:|:--:|
+|key 0 | a0  | b1  | c1  | d0  |
+|key 1 | a1  | b0  | c0  | d1  |
+
+
+interaction:
+
+|what | | | | |
+|--------|--|--|--|--|
+|Passcode| a0|c1|c1|d0|
+|Display 1| 0|1|1|0|
+|Display 2| 0|0|0|0|
+
+The possible passcodes after display 1: 
+0{a0,b0,c0,d0},1{a1,b1,c1,d1},1{a1,b1,c1,d1},0{a0,b0,c0,d0}
+
+The possible passcodes after display 2: 
+0{a0,b1,c1,d0},0{a0,b1,c1,d0},0{a0,b1,c1,d0},0{a0,b1,c1,d0}
+
+Intersect:
+{a0,d0},{b1,c1},{b1,c1},{a0,d0}.
+
+Passcode is not learned yet.
+
+After a 2nd shuffle [attribute 1,3]
+
+|key   | p0 | p1 | p2 | p3 |
+|:----:|:--:|:--:|:--:|:--:|
+|key 0 | a0  | b1  | c0  | d1  |
+|key 1 | a1  | b0  | c1  | d0  |
+
+|what | | | | |
+|--------|--|--|--|--|
+|Passcode| a0|c1|c1|d0|
+|Display 1| 0|1|1|0|
+|Display 2| 0|0|0|0|
+|Display 3| 0|1|1|1|
+
+The possible passcodes after display 3: 
+0{a0,b1,c0,d1},1{a1,b0,c1,d0},1{a1,b0,c1,d0},1{a1,b0,c1,d0}
+
+Intersect:
+{a0},{c1},{c1},{d0}.
+
+Passcode learned in 3.
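
The same bookkeeping can count how many displays are needed before every position is pinned down. This is a sketch under the assumption that the intruder sees each display and the key pressed at each position; `observations_needed` is an illustrative name, not repository code.

```python
def observations_needed(displays, passcode):
    """Return the number of displays after which every position has
    exactly one candidate letter, or None if that never happens."""
    candidates = [None] * len(passcode)
    for count, display in enumerate(displays, start=1):
        for pos, letter in enumerate(passcode):
            # Labels on the key the user pressed for this position.
            shown = next(set(key) for key in display if letter in key)
            if candidates[pos] is None:
                candidates[pos] = shown
            else:
                candidates[pos] &= shown
        if all(len(c) == 1 for c in candidates):
            return count
    return None


displays = [
    [["a0", "b0", "c0", "d0"], ["a1", "b1", "c1", "d1"]],  # initial keypad
    [["a0", "b1", "c1", "d0"], ["a1", "b0", "c0", "d1"]],  # attributes 1,2 shuffled
    [["a0", "b1", "c0", "d1"], ["a1", "b0", "c1", "d0"]],  # attributes 1,3 shuffled
]

print(observations_needed(displays, ["a0", "c1", "c1", "d0"]))  # 3
```

With only the first two displays the function returns `None`, matching the "not learned yet" step in the walkthrough.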
+
+
+Here's the case $p = 4$ and an 8 letter passcode.
+
+
+|key   | p0 | p1 | p2 | p3 |
+|:----:|:--:|:--:|:--:|:--:|
+|key 0 | a0  | b0  | c0  | d0  |
+|key 1 | a1  | b1  | c1  | d1  |
+
+Shuffle 1 [attribute 1,2]
+
+|key   | p0 | p1 | p2 | p3 |
+|:----:|:--:|:--:|:--:|:--:|
+|key 0 | a0  | b1  | c1  | d0  |
+|key 1 | a1  | b0  | c0  | d1  |
+
+Shuffle 2 [attribute 1,3]
+
+|key   | p0 | p1 | p2 | p3 |
+|:----:|:--:|:--:|:--:|:--:|
+|key 0 | a0  | b1  | c0  | d1  |
+|key 1 | a1  | b0  | c1  | d0  |
+
+interaction:
+
+|what | | | | | | | | |
+|--------|--|--|--|--|--|--|--|--|
+|Passcode| a0|c1|c1|d0|b1|b0|a1|d0|
+|Display 1| 0| 1| 1| 0| 1| 0| 1| 0|
+|Display 2| 0| 0| 0| 0| 0| 1| 1| 0|
+|Display 3| 0| 1| 1| 1| 0| 1| 1| 1|
+
+The possible passcodes after display 1: 
+0{a0,b0,c0,d0},1{a1,b1,c1,d1},1{a1,b1,c1,d1},0{a0,b0,c0,d0},
+1{a1,b1,c1,d1},0{a0,b0,c0,d0},1{a1,b1,c1,d1},0{a0,b0,c0,d0}
+
+The possible passcodes after display 2: 
+0{a0,b1,c1,d0},0{a0,b1,c1,d0},0{a0,b1,c1,d0},0{a0,b1,c1,d0},
+0{a0,b1,c1,d0},1{a1,b0,c0,d1},1{a1,b0,c0,d1},0{a0,b1,c1,d0}
+
+Intersect:
+{a0,d0},{b1,c1},{b1,c1},{a0,d0},{b1,c1},{b0,c0},{a1,d1},{a0,d0}
+
+Passcode is not learned yet.
+
+The possible passcodes after display 3: 
+0{a0,b1,c0,d1},1{a1,b0,c1,d0},1{a1,b0,c1,d0},1{a1,b0,c1,d0},
+0{a0,b1,c0,d1},1{a1,b0,c1,d0},1{a1,b0,c1,d0},1{a1,b0,c1,d0}
+
+Intersect:
+{a0},{c1},{c1},{d0},{b1},{b0},{a1},{d0}
+
+Passcode learned in 3.
+
+
+Here's the case $p = 8$ and an 8 letter passcode.
+
+
+|key   | p0 | p1 | p2 | p3 | p4 | p5 | p6 | p7 |
+|:----:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
+|key 0 | a0  | b0  | c0  | d0  | e0  | f0  | g0  | h0  |
+|key 1 | a1  | b1  | c1  | d1  | e1  | f1  | g1  | h1  |
+
+Shuffle 1 [attribute 0,2,4,6] [a,c,e,g]
+
+|key   | p0 | p1 | p2 | p3 | p4 | p5 | p6 | p7 |
+|:----:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
+|key 0 | a1  | b0  | c1  | d0  | e1  | f0  | g1  | h0  |
+|key 1 | a0  | b1  | c0  | d1  | e0  | f1  | g0  | h1  |
+
+Shuffle 2 [attribute 2,3,4,5] [c,d,e,f]
+
+|key   | p0 | p1 | p2 | p3 | p4 | p5 | p6 | p7 |
+|:----:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
+|key 0 | a0  | b0  | c1  | d1  | e1  | f1  | g0  | h0  |
+|key 1 | a1  | b1  | c0  | d0  | e0  | f0  | g1  | h1  |
+
+Shuffle 3 [attribute 0,4,5,6] [a,e,f,g]
+
+|key   | p0 | p1 | p2 | p3 | p4 | p5 | p6 | p7 |
+|:----:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
+|key 0 | a1  | b0  | c0  | d0  | e1  | f1  | g1  | h0  |
+|key 1 | a0  | b1  | c1  | d1  | e0  | f0  | g0  | h1  |
+
+
+interaction:
+
+|what | | | | | | | | |
+|--------|--|--|--|--|--|--|--|--|
+|Passcode| a1|c1|b0|h0|f0|e1|a1|d0|
+|Display 1| 1| 1| 0| 0| 0| 1| 1| 0|
+|Display 2| 0| 0| 0| 0| 0| 0| 0| 0|
+|Display 3| 1| 0| 0| 0| 1| 0| 1| 1|
+|Display 4| 0| 1| 0| 0| 1| 0| 0| 0|
+
+The possible passcodes from display 1:  
+`1{a1,b1,c1,d1,e1,f1,g1,h1},`  
+`1{a1,b1,c1,d1,e1,f1,g1,h1},`  
+`0{a0,b0,c0,d0,e0,f0,g0,h0},`  
+`0{a0,b0,c0,d0,e0,f0,g0,h0},`  
+`0{a0,b0,c0,d0,e0,f0,g0,h0},`  
+`1{a1,b1,c1,d1,e1,f1,g1,h1},`  
+`1{a1,b1,c1,d1,e1,f1,g1,h1},`  
+`0{a0,b0,c0,d0,e0,f0,g0,h0}`
+
+The possible passcodes from display 2:  
+`0{a1,b0,c1,d0,e1,f0,g1,h0},`  
+`0{a1,b0,c1,d0,e1,f0,g1,h0},`  
+`0{a1,b0,c1,d0,e1,f0,g1,h0},`  
+`0{a1,b0,c1,d0,e1,f0,g1,h0},`  
+`0{a1,b0,c1,d0,e1,f0,g1,h0},`  
+`0{a1,b0,c1,d0,e1,f0,g1,h0},`  
+`0{a1,b0,c1,d0,e1,f0,g1,h0},`  
+`0{a1,b0,c1,d0,e1,f0,g1,h0}`
+
+Intersect:  
+`1{a1,c1,e1,g1},`  
+`1{a1,c1,e1,g1},`  
+`0{b0,d0,f0,h0},`  
+`0{b0,d0,f0,h0},`  
+`0{b0,d0,f0,h0},`  
+`1{a1,c1,e1,g1},`  
+`1{a1,c1,e1,g1},`  
+`0{b0,d0,f0,h0}`  
+
+Passcode is not learned yet.
+
+Shuffle 2 [attribute 2,3,4,5] [c,d,e,f]
+
+|key   | p0 | p1 | p2 | p3 | p4 | p5 | p6 | p7 |
+|:----:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
+|key 0 | a0  | b0  | c1  | d1  | e1  | f1  | g0  | h0  |
+|key 1 | a1  | b1  | c0  | d0  | e0  | f0  | g1  | h1  |
+
+So before observing the input for the 3rd attempt:
+Hits for each key from prior information:
+[1001][0110]
+(Half are 1's and half are 0's for each candidate set: $\{a1,c1,e1,g1\}$ gives [1001] and $\{b0,d0,f0,h0\}$ gives [0110].)
+
+Now observe the sequence 10001011
+The possible passcodes from display 3:  
+`1{a1,b1,c0,d0,e0,f0,g1,h1},`  
+`0{a0,b0,c1,d1,e1,f1,g0,h0},`  
+`0{a0,b0,c1,d1,e1,f1,g0,h0},`  
+`0{a0,b0,c1,d1,e1,f1,g0,h0},`  
+`1{a1,b1,c0,d0,e0,f0,g1,h1},`  
+`0{a0,b0,c1,d1,e1,f1,g0,h0},`  
+`1{a1,b1,c0,d0,e0,f0,g1,h1},`  
+`1{a1,b1,c0,d0,e0,f0,g1,h1}`  
+
+Intersect with prior information:  
+`1{a1,g1},`  
+`0{c1,e1},`  
+`0{b0,h0},`  
+`0{b0,h0},`  
+`1{d0,f0},`  
+`0{c1,e1},`  
+`1{a1,g1},`  
+`1{d0,f0}`  
+
+Passcode not learned yet, but after one more shuffle, we think we would learn the passcode.
+
+Shuffle 3 [attribute 0,4,5,6] [a,e,f,g]
+
+|key   | p0 | p1 | p2 | p3 | p4 | p5 | p6 | p7 |
+|:----:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
+|key 0 | a1  | b0  | c0  | d0  | e1  | f1  | g1  | h0  |
+|key 1 | a0  | b1  | c1  | d1  | e0  | f0  | g0  | h1  |
+
+So before observing the input for the 4th attempt:  
+Hits for each key from prior information:  
+[0][10][0][0][01][10][0][01]  
+(So the correct key is already identified for several positions; only the keys for positions 1,4,5,7 are unknown.)
+
+
+The possible passcodes from display 4:   
+`0{a1,b0,c0,d0,e1,f1,g1,h0},`  
+`1{a0,b1,c1,d1,e0,f0,g0,h1},`  
+`0{a1,b0,c0,d0,e1,f1,g1,h0},`  
+`0{a1,b0,c0,d0,e1,f1,g1,h0},`  
+`1{a0,b1,c1,d1,e0,f0,g0,h1},`  
+`0{a1,b0,c0,d0,e1,f1,g1,h0},`  
+`0{a1,b0,c0,d0,e1,f1,g1,h0},`  
+`0{a1,b0,c0,d0,e1,f1,g1,h0},`  
+
+Intersect:  
+`{a1,g1},`  
+`{c1},`  
+`{b0,h0},`  
+`{b0,h0},`  
+`{f0},`  
+`{e1},`  
+`{a1,g1},`  
+`{d0}`  
+
+Passcode only partially learned after 3 shuffles:
+- uncertain about positions 0,2,3,6  = {a1 or g1}, {b0 or h0}
+- certain about positions 1,4,5,7 = c1,f0,e1,d0
+
+Shuffles:  
+[],   
+[attribute 0,2,4,6] [a c e g ]  
+[attribute 2,3,4,5] [  cdef  ]  
+[attribute 0,4,5,6] [a   efg ]  
+1,7 [b,h] - not shuffled 4 times, so the key pressed was always the same for the 3rd and 4th character.
+
+3 -- shuffled 1 time, not shuffled 3 times  
+4 -- shuffled 3 times, not shuffled 1 time  
+0,2,5,6 -- shuffled 2 times, not shuffled 2 times  
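
One way to see why positions 0,2,3,6 stay ambiguous is to compare, for each attribute, in which displays it was shuffled. With $k = 2$, two letters with the same key index whose attributes have identical shuffle patterns sit on the same key in every display, so intersection can never separate them. A minimal sketch (the function name is an illustrative assumption):

```python
from collections import defaultdict


def ambiguous_attribute_groups(attributes, shuffles):
    """Group attributes by their shuffled / not-shuffled pattern across
    displays; groups with more than one member cannot be told apart."""
    groups = defaultdict(list)
    for attr in attributes:
        signature = tuple(attr in shuffled for shuffled in shuffles)
        groups[signature].append(attr)
    return [members for members in groups.values() if len(members) > 1]


# Displays 1 to 4 of the p = 8 example: which attributes were shuffled.
shuffles = [set(), set("aceg"), set("cdef"), set("aefg")]

print(ambiguous_attribute_groups("abcdefgh", shuffles))
# [['a', 'g'], ['b', 'h']]
```

The two groups match the leftover candidates {a1 or g1} and {b0 or h0} in the walkthrough.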
+
+Note that this passcode didn't use the letter g.
+
+This experiment makes us think of several questions. 
+1) The ability to learn any possible passcode character from observations of inputs.
+2) The likelihood of learning a given passcode of a fixed length.
+3) Could you get the correct entry even with partial information?
+4) Probabilistic attack for guessing the correct sequence? 
+    First split shuffle evenly splits things, so you don't gain much. Subsequent shuffles will start to bias toward the correct passcode. The trade-off is that if you don't shuffle a position then the intruder can just use the same input as before and can enter the correct key without knowing the correct icon, but if you do shuffle, then the intruder learns more information.
+
+
+
+One key point I’m understanding better now: There’s a trade-off between (1) information being learned by the intruder and (2) the chances that an intruder could key in a correct key-sequence for each subsequent observation / login attempt.
+
+If an attribute is not shuffled, then that increases the chance for (2), and if an attribute is shuffled that increases the chance for (1). 
+
+I think that when the selection of the sets is randomized like you’re doing, you’re getting a good trade-off between these two things. However, there’s also probably a strategy that does better than randomly choosing a split each time: each shuffle chooses its set of half of the attributes by taking half from the attributes that were shuffled in the previous shuffle and half from the attributes that weren’t. You could look back essentially $\log_2(p)$ steps (where $p$ is the number of attributes) and similarly subdivide. This seems to be a way to balance the trade-off for consecutive observations.
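
The balanced-split idea can be sketched as follows. This is only one possible reading of the strategy, not code from the repository: attributes are grouped by whether they were shuffled in each of the last `lookback` shuffles, and half of every group is chosen for the next shuffle. The name `next_split` and the `lookback` parameter are assumptions; the sketch is only balanced when the group sizes stay even (for example $p = 8$ with `lookback` at most 2).

```python
import random


def next_split(history, p, rng, lookback=1):
    """Choose the attributes to shuffle next: half of each group of
    attributes that share a shuffle pattern over the last `lookback`
    shuffles (lookback must be at least 1)."""
    recent = history[-lookback:]
    groups = {}
    for attr in range(p):
        signature = tuple(attr in shuffled for shuffled in recent)
        groups.setdefault(signature, []).append(attr)
    chosen = set()
    for members in groups.values():
        chosen.update(rng.sample(members, len(members) // 2))
    return chosen


if __name__ == "__main__":
    rng = random.Random(1)
    history = []
    for _ in range(5):
        history.append(next_split(history, 8, rng, lookback=2))
    for shuffled in history:
        print(sorted(shuffled))
```

With $p = 8$, every split has 4 attributes and shares exactly 2 with the previous one, which is also the pattern of the hand-picked shuffles [0,2,4,6], [2,3,4,5], [0,4,5,6] in the example above.
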
--- a/example/obs_json/observation_1.json
+++ b/example/obs_json/observation_1.json
--- a/example/obs_png/run_001/observation_001.png
+++ b/example/obs_png/run_001/observation_001.png
--- a/example/obs_png/run_001/observation_002.png
+++ b/example/obs_png/run_001/observation_002.png
--- a/example/obs_png/run_001/observation_003.png
+++ b/example/obs_png/run_001/observation_003.png
--- a/example/obs_png/run_001/observation_004.png
+++ b/example/obs_png/run_001/observation_004.png
--- a/example/obs_png/run_001/observation_005.png
+++ b/example/obs_png/run_001/observation_005.png
--- a/example/obs_png/run_001/observation_006.png
+++ b/example/obs_png/run_001/observation_006.png
--- a/example/obs_png/run_001/observation_007.png
+++ b/example/obs_png/run_001/observation_007.png
--- a/example/obs_png/run_001/observation_008.png
+++ b/example/obs_png/run_001/observation_008.png
--- a/example/obs_png/run_001/observation_009.png
+++ b/example/obs_png/run_001/observation_009.png
--- a/example/obs_png/run_001/observation_010.png
+++ b/example/obs_png/run_001/observation_010.png
--- a/example/obs_png/run_001/observation_011.png
+++ b/example/obs_png/run_001/observation_011.png
--- a/example/obs_png/run_001/observation_012.png
+++ b/example/obs_png/run_001/observation_012.png
--- a/example/obs_png/run_001/observation_013.png
+++ b/example/obs_png/run_001/observation_013.png
--- a/example/obs_png/run_001/observation_014.png
+++ b/example/obs_png/run_001/observation_014.png
--- a/example/obs_png/run_001/observation_015.png
+++ b/example/obs_png/run_001/observation_015.png
--- a/example/obs_png/run_001/observation_016.png
+++ b/example/obs_png/run_001/observation_016.png
--- a/example/obs_png/run_001/observation_017.png
+++ b/example/obs_png/run_001/observation_017.png
--- a/example/obs_png/run_001/observation_018.png
+++ b/example/obs_png/run_001/observation_018.png
--- a/example/obs_png/run_001/observation_019.png
+++ b/example/obs_png/run_001/observation_019.png
--- a/example/obs_png/run_001/observation_020.png
+++ b/example/obs_png/run_001/observation_020.png
--- a/example/obs_png/run_001/observation_021.png
+++ b/example/obs_png/run_001/observation_021.png
--- a/example/obs_png/run_001/observation_022.png
+++ b/example/obs_png/run_001/observation_022.png
--- a/example/obs_png/run_001/observation_023.png
+++ b/example/obs_png/run_001/observation_023.png
--- a/example/obs_png/run_001/observation_024.png
+++ b/example/obs_png/run_001/observation_024.png
--- a/example/obs_png/run_001/observation_025.png
+++ b/example/obs_png/run_001/observation_025.png
--- a/example/obs_png/run_001/observation_026.png
+++ b/example/obs_png/run_001/observation_026.png
--- a/example/obs_png/run_001/observation_027.png
+++ b/example/obs_png/run_001/observation_027.png
--- a/example/obs_png/run_001/observation_028.png
+++ b/example/obs_png/run_001/observation_028.png
--- a/example/obs_png/run_001/observation_029.png
+++ b/example/obs_png/run_001/observation_029.png
--- a/example/obs_png/run_001/observation_030.png
+++ b/example/obs_png/run_001/observation_030.png
--- a/example/obs_png/run_001/observation_031.png
+++ b/example/obs_png/run_001/observation_031.png
--- a/example/obs_png/run_001/observation_032.png
+++ b/example/obs_png/run_001/observation_032.png
--- a/example/obs_png/run_001/observation_033.png
+++ b/example/obs_png/run_001/observation_033.png
--- a/example/obs_png/run_001/observation_034.png
+++ b/example/obs_png/run_001/observation_034.png
--- a/example/obs_png/run_001/observation_035.png
+++ b/example/obs_png/run_001/observation_035.png
--- a/example/obs_png/run_001/observation_036.png
+++ b/example/obs_png/run_001/observation_036.png
--- a/example/obs_png/run_001/observation_037.png
+++ b/example/obs_png/run_001/observation_037.png
--- a/example/obs_png/run_001/observation_038.png
+++ b/example/obs_png/run_001/observation_038.png
--- a/example/obs_png/run_001/observation_039.png
+++ b/example/obs_png/run_001/observation_039.png
--- a/example/obs_png/run_001/observation_040.png
+++ b/example/obs_png/run_001/observation_040.png
--- a/example/obs_png/run_001/observation_041.png
+++ b/example/obs_png/run_001/observation_041.png
--- a/example/obs_png/run_001/observation_042.png
+++ b/example/obs_png/run_001/observation_042.png
--- a/example/obs_png/run_001/observation_043.png
+++ b/example/obs_png/run_001/observation_043.png
--- a/example/obs_png/run_001/observation_044.png
+++ b/example/obs_png/run_001/observation_044.png
--- a/example/obs_png/run_001/observation_045.png
+++ b/example/obs_png/run_001/observation_045.png
--- a/example/obs_png/run_001/observation_046.png
+++ b/example/obs_png/run_001/observation_046.png
--- a/example/obs_png/run_001/observation_047.png
+++ b/example/obs_png/run_001/observation_047.png
--- a/example/obs_png/run_001/observation_048.png
+++ b/example/obs_png/run_001/observation_048.png
--- a/example/obs_png/run_001/observation_049.png
+++ b/example/obs_png/run_001/observation_049.png
--- a/example/obs_png/run_001/observation_050.png
+++ b/example/obs_png/run_001/observation_050.png
--- a/notebooks/evilkode.ipynb
+++ b/notebooks/evilkode.ipynb
--- a/src/benchmark.py
+++ b/src/benchmark.py
@@ -1,17 +1,82 @@
-from src.evilkode import Evilkode
+from src.evilkode import Observation, Evilkode
+from src.keypad import Keypad
+import random
 from dataclasses import dataclass
 from statistics import mean, variance
+from enum import Enum
 from pathlib import Path

-from src.utils import ShuffleTypes, observations, passcode_generator
-
-
@dataclass
 class Benchmark:
    mean: int
    variance: int
    runs: list[int]

+class ShuffleTypes(Enum):
+    FULL_SHUFFLE = "FULL_SHUFFLE"
+    SPLIT_SHUFFLE = "SPLIT_SHUFFLE"
+
+def observations(number_of_keys, properties_per_key, passcode_len, complexity: int, disparity: int, shuffle_type: ShuffleTypes):
+    k = number_of_keys
+    p = properties_per_key
+    n = passcode_len
+    passcode =  passcode_generator(k, p, n, complexity, disparity)
+    keypad = Keypad.new_keypad(k, p)
+
+    def obs_gen():
+        for _ in range(100):  # finite number of yields
+            yield Observation(
+                keypad=keypad.keypad.copy(),
+                key_selection=keypad.key_entry(target_passcode=passcode)
+            )
+            match shuffle_type:
+                case ShuffleTypes.FULL_SHUFFLE:
+                    keypad.full_shuffle()
+                case ShuffleTypes.SPLIT_SHUFFLE:
+                    keypad.split_shuffle()
+                case _:
+                    raise Exception(f"no shuffle type {shuffle_type}")
+
+    return obs_gen()
+
+def passcode_generator(k: int, p: int, n: int, c: int, d: int) -> list[int]:
+    assert n >= c
+    assert p*k >= c
+
+    assert n >= d
+    assert p >= d
+    passcode_prop = []
+    passcode_set = []
+    valid_choices = {i for i in range(k*p)}
+    repeat_set = n-d
+    repeat_prop = n-c
+    prop_added = set()
+    set_added = set()
+
+    for _ in range(n):
+        prop = random.choice(list(valid_choices))
+        prop_set = prop//p
+        passcode_prop.append(prop)
+        passcode_set.append(prop_set)
+
+        if prop in prop_added:
+            repeat_prop -= 1
+        if prop_set in set_added:
+            repeat_set -= 1
+
+        prop_added.add(prop)
+        set_added.add(prop_set)
+
+        if repeat_prop <= 0:
+            valid_choices -= prop_added
+
+        if repeat_set <= 0:
+            for el in valid_choices.copy():
+                if el // p in set_added:
+                    valid_choices.remove(el)
+
+    return passcode_prop
+

 def shuffle_benchmark(
        number_of_keys: int,
@@ -29,6 +94,7 @@ def shuffle_benchmark(
    full_path = Path(file_path) / file_name
    if not overwrite and full_path.exists():
        print(f"file exists {file_path}")
+
        with open(full_path, "r") as fp:
            runs = fp.readline()
            runs = runs.split(',')
@@ -40,14 +106,13 @@ def shuffle_benchmark(
            )
    runs = []
    for _ in range(run_count):
-        passcode = passcode_generator(number_of_keys, properties_per_key, passcode_len, complexity, disparity)
        evilkode = Evilkode(
            observations=observations(
-                target_passcode=passcode,
                number_of_keys=number_of_keys,
                properties_per_key=properties_per_key,
-                min_complexity=complexity,
-                min_disparity=disparity,
+                passcode_len=passcode_len,
+                complexity=complexity,
+                disparity=disparity,
                shuffle_type=shuffle_type,
            ),
            number_of_keys=number_of_keys,
@@ -80,14 +145,13 @@ def full_shuffle_benchmark(
 ) -> Benchmark:
    runs = []
    for _ in range(run_count):
-        passcode = passcode_generator(number_of_keys, properties_per_key, passcode_len, complexity, disparity)
        evilkode = Evilkode(
            observations=observations(
-                target_passcode=passcode,
                number_of_keys=number_of_keys,
                properties_per_key=properties_per_key,
-                min_complexity=complexity,
-                min_disparity=disparity,
+                passcode_len=passcode_len,
+                complexity=complexity,
+                disparity=disparity,
                shuffle_type=ShuffleTypes.FULL_SHUFFLE,
            ),
            number_of_keys=number_of_keys,
--- a/src/evilkode.py
+++ b/src/evilkode.py
@@ -44,4 +44,5 @@ class Evilkode:
                return EvilOutput(possible_nkodes=[list(el) for el in self.possible_nkode], iterations=idx+1)
            for jdx, props in enumerate(obs.property_list):
                self.possible_nkode[jdx] = props.intersection(self.possible_nkode[jdx])
+
        raise Exception("error in Evilkode, observations stopped yielding")
--- a/src/utils.py
+++ b/src/utils.py
@@ -1,79 +1,7 @@
-import random
-from enum import Enum
 from math import factorial, comb

-from src.evilkode import Observation
-from src.keypad import Keypad
-
-
 def total_valid_nkode_states(k: int, p: int) -> int:
   return factorial(k) ** (p-1)

 def total_shuffle_states(k: int, p: int) -> int:
    return comb((p-1), (p-1) // 2) * factorial(k)
-
-
-class ShuffleTypes(Enum):
-    FULL_SHUFFLE = "FULL_SHUFFLE"
-    SPLIT_SHUFFLE = "SPLIT_SHUFFLE"
-
-
-def observations(target_passcode: list[int], number_of_keys:int, properties_per_key: int, min_complexity: int, min_disparity: int, shuffle_type: ShuffleTypes, number_of_observations: int = 100):
-    k = number_of_keys
-    p = properties_per_key
-    keypad = Keypad.new_keypad(k, p)
-
-    def obs_gen():
-        for _ in range(number_of_observations):
-            yield Observation(
-                keypad=keypad.keypad.copy(),
-                key_selection=keypad.key_entry(target_passcode=target_passcode)
-            )
-            match shuffle_type:
-                case ShuffleTypes.FULL_SHUFFLE:
-                    keypad.full_shuffle()
-                case ShuffleTypes.SPLIT_SHUFFLE:
-                    keypad.split_shuffle()
-                case _:
-                    raise Exception(f"no shuffle type {shuffle_type}")
-
-    return obs_gen()
-
-
-def passcode_generator(k: int, p: int, n: int, c: int, d: int) -> list[int]:
-    assert n >= c
-    assert p*k >= c
-
-    assert n >= d
-    assert p >= d
-    passcode_prop = []
-    passcode_set = []
-    valid_choices = {i for i in range(k*p)}
-    repeat_set = n-d
-    repeat_prop = n-c
-    prop_added = set()
-    set_added = set()
-
-    for _ in range(n):
-        prop = random.choice(list(valid_choices))
-        prop_set = prop//p
-        passcode_prop.append(prop)
-        passcode_set.append(prop_set)
-
-        if prop in prop_added:
-            repeat_prop -= 1
-        if prop_set in set_added:
-            repeat_set -= 1
-
-        prop_added.add(prop)
-        set_added.add(prop_set)
-
-        if repeat_prop <= 0:
-            valid_choices -= prop_added
-
-        if repeat_set <= 0:
-            for el in valid_choices.copy():
-                if el // p in set_added:
-                    valid_choices.remove(el)
-
-    return passcode_prop
--- a/src/visualnkode.py
+++ b/src/visualnkode.py
@@ -1,243 +0,0 @@
-import json
-from dataclasses import dataclass, asdict
-from evilkode import Observation
-from utils import observations, passcode_generator, ShuffleTypes
-from pathlib import Path
-from PIL import Image, ImageDraw, ImageFont
-from typing import Iterable
-
-# Project root = parent of *this* file's directory
-PROJECT_ROOT = Path(__file__).resolve().parent.parent
-OUTPUT_DIR = PROJECT_ROOT / "example" / "obs_json"
-PNG_DIR = PROJECT_ROOT / "example" / "obs_png"
-
-@dataclass
-class ObservationSequence:
-    target_passcode: list[int]
-    observations: list[Observation]
-
-def new_observation_sequence(
-        number_of_keys: int,
-        properties_per_key: int,
-        passcode_len: int,
-        complexity: int,
-        disparity: int,
-        numb_runs: int,
-) -> ObservationSequence:
-    passcode = passcode_generator(number_of_keys, properties_per_key, passcode_len, complexity, disparity)
-    obs_seq = ObservationSequence(target_passcode=passcode, observations=[])
-    obs_gen = observations(
-        target_passcode=passcode,
-        number_of_keys=number_of_keys,
-        properties_per_key=properties_per_key,
-        min_complexity=complexity,
-        min_disparity=disparity,
-        shuffle_type=ShuffleTypes.SPLIT_SHUFFLE,
-        number_of_observations=numb_runs,
-    )
-    for obs in obs_gen:
-        obs.keypad = obs.keypad.tolist()
-        obs_seq.observations.append(obs)
-
-    return obs_seq
-
-def _next_json_filename(base_dir: Path) -> Path:
-    """Find the next available observation_X.json file in base_dir."""
-    counter = 1
-    while True:
-        candidate = base_dir / f"observation_{counter}.json"
-        if not candidate.exists():
-            return candidate
-        counter += 1
-
-def save_observation_sequence_to_json(seq: ObservationSequence, filename: Path | None = None) -> None:
-    """
-    Save ObservationSequence to JSON.
-    - If filename is None, put it under PROJECT_ROOT/output/obs_json/ as observation_{n}.json
-    - Creates directory if needed
-    """
-    if filename is None:
-        base_dir =  OUTPUT_DIR
-        base_dir.mkdir(parents=True, exist_ok=True)
-        filename = _next_json_filename(base_dir)
-    else:
-        filename.parent.mkdir(parents=True, exist_ok=True)
-
-    with filename.open("w", encoding="utf-8") as f:
-        json.dump(asdict(seq), f, indent=4)
-
-# ---------- Helpers ----------
-def _load_font(preferred: str, size: int) -> ImageFont.FreeTypeFont | ImageFont.ImageFont:
-    """Try a preferred TTF, fall back to common monospace, then PIL default."""
-    candidates = [
-        preferred,
-        "DejaVuSansMono.ttf",       # common on Linux
-        "Consolas.ttf",             # Windows
-        "Menlo.ttc", "Menlo.ttf",   # macOS
-        "Courier New.ttf",
-    ]
-    for c in candidates:
-        try:
-            return ImageFont.truetype(c, size)
-        except Exception:
-            continue
-    return ImageFont.load_default()
-
-def _text_size(draw: ImageDraw.ImageDraw, text: str, font: ImageFont.ImageFont) -> tuple[int, int]:
-    """Get (w, h) using font bbox for accurate layout."""
-    left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
-    return right - left, bottom - top
-
-def _join_nums(nums: Iterable[int]) -> str:
-    return " ".join(str(n) for n in nums)
-
-def _next_available_path(path: Path) -> Path:
-    """If path exists, append _1, _2, ..."""
-    if not path.exists():
-        return path
-    base, suffix = path.stem, path.suffix or ".png"
-    i = 1
-    while True:
-        candidate = path.with_name(f"{base}_{i}{suffix}")
-        if not candidate.exists():
-            return candidate
-        i += 1
-
-# ---------- Core rendering ----------
-def render_observation_to_png(
-    target_passcode: list[int],
-    obs: Observation,
-    out_path: Path,
-    *,
-    header_font_name: str = "DejaVuSans.ttf",
-    body_font_name: str = "DejaVuSans.ttf",
-    header_size: int = 28,
-    body_size: int = 24,
-    margin: int = 32,
-    row_padding_xy: tuple[int, int] = (16, 12),   # (x, y) padding inside row box
-    row_spacing: int = 14,
-    header_spacing: int = 10,
-    section_spacing: int = 18,
-    bg_color: str = "white",
-    fg_color: str = "black",
-    row_fill: str = "#f7f7f7",
-    row_outline: str = "#222222",
-):
-    """
-    Render a single observation:
-    - Top lines:
-        Target Passcode: {target}
-        Selected Keys:   {selected keys}
-    - Then a stack of row boxes representing the keypad rows.
-    """
-    out_path.parent.mkdir(parents=True, exist_ok=True)
-    out_path = _next_available_path(out_path)
-
-    # Fonts
-    header_font = _load_font(header_font_name, header_size)
-    body_font = _load_font(body_font_name, body_size)
-
-    # Prepare strings
-    header1 = f"Target Passcode: {_join_nums(target_passcode)}"
-    header2 = f"Selected Keys:  {_join_nums(obs.key_selection)}"
-    row_texts = [_join_nums(row) for row in obs.keypad]
-
-    # Measure to compute canvas size
-    # Provisional image for measurement
-    temp_img = Image.new("RGB", (1, 1), bg_color)
-    d = ImageDraw.Draw(temp_img)
-
-    h1_w, h1_h = _text_size(d, header1, header_font)
-    h2_w, h2_h = _text_size(d, header2, header_font)
-
-    row_text_sizes = [_text_size(d, t, body_font) for t in row_texts]
-    row_box_widths = [tw + 2 * row_padding_xy[0] for (tw, th) in row_text_sizes]
-    row_box_heights = [th + 2 * row_padding_xy[1] for (tw, th) in row_text_sizes]
-
-    content_width = max([h1_w, h2_w] + (row_box_widths or [0]))
-    total_rows_height = sum(row_box_heights) + row_spacing * max(0, len(row_box_heights) - 1)
-
-    width = content_width + 2 * margin
-    height = (
-        margin
-        + h1_h
-        + header_spacing
-        + h2_h
-        + section_spacing
-        + total_rows_height
-        + margin
-    )
-
-    # Create final image
-    img = Image.new("RGB", (max(width, 300), max(height, 200)), bg_color)
-    draw = ImageDraw.Draw(img)
-
-    # Draw headers
-    x = margin
-    y = margin
-    draw.text((x, y), header1, font=header_font, fill=fg_color)
-    y += h1_h + header_spacing
-    draw.text((x, y), header2, font=header_font, fill=fg_color)
-    y += h2_h + section_spacing
-
-    # Draw row boxes with evenly spaced numbers
-    max_box_width = max(row_box_widths) if row_box_widths else 0
-    for row, box_h in zip(obs.keypad, row_box_heights):
-        box_left = x
-        box_top = y
-        box_right = x + max_box_width
-        box_bottom = y + box_h
-
-        # draw row rectangle
-        draw.rectangle(
-            [box_left, box_top, box_right, box_bottom],
-            fill=row_fill,
-            outline=row_outline,
-            width=2
-        )
-
-        # evenly spaced numbers
-        n = len(row)
-        if n > 0:
-            available_width = max_box_width - 2 * row_padding_xy[0]
-            spacing = available_width / (n + 1)
-
-            for idx, num in enumerate(row, start=1):
-                num_text = str(num)
-                num_w, num_h = _text_size(draw, num_text, body_font)
-                num_x = box_left + row_padding_xy[0] + spacing * idx - num_w / 2
-                num_y = box_top + (box_h - num_h) // 2
-                draw.text((num_x, num_y), num_text, font=body_font, fill=fg_color)
-
-        y = box_bottom + row_spacing
-
-    img.save(out_path, format="PNG")
-
-def _next_run_dir(base_dir: Path) -> Path:
-    """Find the next available run directory under base_dir (run_001, run_002, ...)."""
-    counter = 1
-    while True:
-        run_dir = base_dir / f"run_{counter:03d}"
-        if not run_dir.exists():
-            run_dir.mkdir(parents=True)
-            return run_dir
-        counter += 1
-
-def render_sequence_to_pngs(seq: ObservationSequence, out_dir: Path | None = None) -> None:
-    """
-    Render each observation to its own PNG inside a fresh run directory.
-    Default: PROJECT_ROOT/output/obs_png/run_XXX/observation_001.png
-    """
-    base_dir = PNG_DIR if out_dir is None else out_dir
-    base_dir.mkdir(parents=True, exist_ok=True)
-
-    # Create a fresh run dir
-    run_dir = _next_run_dir(base_dir)
-
-    for i, obs in enumerate(seq.observations, start=1):
-        filename = run_dir / f"observation_{i:03d}.png"
-        render_observation_to_png(seq.target_passcode, obs, filename)
-if __name__ == "__main__":
-    obs_seq = new_observation_sequence(6, 9,4,0,0,numb_runs=50)
-    save_observation_sequence_to_json(obs_seq)
-    render_sequence_to_pngs(obs_seq)
--- a/tests/test_benchmark.py
+++ b/tests/test_benchmark.py
@@ -1,4 +1,4 @@
-from src.utils import passcode_generator
+from src.benchmark import passcode_generator
 import pytest

@pytest.mark.parametrize(
Author	SHA1	Message	Date
loeding	90d8c6cd13	minor edit, and adding to the end	2025-07-23 21:03:37 +00:00
loeding	38d0a68371	going deeper on shuffles	2025-07-23 20:33:57 +00:00
loeding	ad2e53195c	started analysis of shuffles	2025-07-21 22:51:07 +00:00