Cryptatools-rs Philosophy

cryptatools-rs is a cryptanalysis tool for attacking custom or incorrectly implemented algorithms.

This tool aims to be professional, not only a learning tool. It is meant for realistic exploitation and code breaking.

You can "plug in" your script to any protocol: man-in-the-middle setups, blockchain cores, or anything else. Examples:

You can use the pypcap Python library to read packets, the dpkt Python library to parse them, and then cryptatools to break the encryption on these packets. This is why the library is available through several bindings, such as Python.
You can use rust-web3 to parse the hash tree of a vulnerable cryptocurrency (shitcoin) blockchain and forge a signature with a collision attack in order to steal money. See this reference.

You can also work on programs obfuscated by encryption, such as malware. In this case, you can decipher program data (e.g. data contained in a dropper) as well as self-encrypted code. In this way you can plug cryptatools into your favorite reverse engineering framework. E.g.:

Install radare2. Then do radare2 -AA -i <yourscriptname>.rs <yourmalwaretoreverse>. If you work with python bindings, radare2 -AA -i <yourscriptname>.py <yourmalwaretoreverse>
You can also work on extracted code from the malware:

radare2 <malwaretoreverse>; 
=> s sym.encrypted;
pr 12345 > ciphertext.bin

Where 12345 is the size of the encrypted function or code of the malware.

The library is very flexible. One of its main goals is to break custom cryptography, which is why you will find classical cryptanalysis in cryptatools. This flexibility also aims at breaking obfuscated or encrypted malware. Such malware is often written in assembly language because it deals with the system directly; it therefore has to reimplement many things, and its encryption methods are often poorly written.

You can automate any task. There is a command line interface.

Licence

This software has a free software (libre) license. The license is GPL3.

Feel free to contact me if you absolutely need a dual license. I will then decide.

Contact

Join: our discord server
Project Link: https://github.com/gogo2464/cryptatools-rs
gogo - gogo246475@gmail.com

Getting Started

This is a cryptanalysis tool against custom or incorrectly implemented algorithms, for cybersecurity researchers, exploit developers and CTF players.

The name comes from pwntools, a similar tool for exploiting memory corruption vulnerabilities. This software aims to work like pwntools, but for cryptanalysis.

Like pwntools, this program includes a library (the counterpart of pwnlib) and exposes some command line tools. It is a rewrite of the Python version in Rust, in order to be fast, portable and usable from Rust, Python, Ruby, JavaScript and Kotlin.

Full documentation is available at this address.

Installation

1-Rust library installation.

Installing cryptatools-core for Rust is the same on any OS. In Cargo.toml, just write:

[dependencies]
cryptatools-core = { git = "https://github.com/gogo2464/cryptatools-rs", package = 'cryptatools-core' }

It works on the Rust stable, unstable and nightly toolchains.

2-Python binding installation.

To install the python Bindings you can use pip or build from source:

2.1-Install python Bindings from pip:

The name cryptatools-python3 is the name of the package used to install cryptatools core python bindings. In order to install it, do:

pip install cryptatools-python3

The package version is incremented by one sub-version on each pull request, so it is updated often.

2.2-Build Python Bindings from sources

If you are on windows, with powershell do:

python -m venv myenv
.\myenv\Scripts\activate
pip install setuptools wheel
git clone https://github.com/gogo2464/cryptatools-rs ;
cd cryptatools-rs
python .\cryptatools-core\setup.py bdist_wheel --verbose ;
$wheelFile = Get-ChildItem -Path .\dist\ -Recurse -Include * ;
pip3 install $wheelFile --force-reinstall ;

If you are on Linux, do:

virtualenv -p python3 myenv
source myenv/bin/activate
pip install setuptools wheel
git clone https://github.com/gogo2464/cryptatools-rs ;
cd cryptatools-rs
python3 ./cryptatools-core/setup.py bdist_wheel --verbose ;
pip3 install ./dist/* --force-reinstall ;

If you are on MacOs, do:

virtualenv -p python3 myenv
source myenv/bin/activate
pip install setuptools wheel
git clone https://github.com/gogo2464/cryptatools-rs ;
cd cryptatools-rs
python3 ./cryptatools-core/setup.py bdist_wheel --verbose ;
pip3 install ./dist/* --force-reinstall ;

3-cryptatools-cli, the command line interface

The cryptatools command line interface is split into several programs in order to follow the Unix philosophy. To install them, do:

git clone https://github.com/gogo2464/cryptatools-rs/ &&
cargo install --path .\cryptatools-rs\cryptatools-cli\ ;

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

Document your work. See the how to make good documentation chapter.
For each method or object implemented, do not forget to make tests with doctest. See here.
Create Python binding. See this link.
Benchmark your work. See this link.
Create cli interface. See This link.
Open PR.

1-Documentation

The documentation is generated from docstrings with rustdoc. To edit the documentation, go to the code and modify the docstring after ///.

Then, in order to generate documentation from root folder do:

cargo doc --all --no-deps

Feel free to view your own doc with

cargo doc --open --all --no-deps

This project uses cargo doc.

The documentation is self-generated on each pull request.

To check if the cryptatools documentation has been updated after a merge request, see: the API documentation.

2-Testing

In order to run unit tests, you MUST be in the directory cryptatools-rs. Run unit tests with doctests with the command:

cargo test --all

Unit tests are written as doctests.


3-Create Python bindings.

Cryptatools relies on uniffi to provide bindings to Python3. Make sure to provide Python3 bindings before opening your pull request.

To create Python bindings, edit the file cryptatools-rs/src/cryptatools.udl and add your newly created object as described in the official uniffi documentation at this address: https://mozilla.github.io/uniffi-rs/udl_file_spec.html.

Then do not forget to edit the file cryptatools-rs/cryptatools-core/src/lib.rs and bring your new object into scope just before the line:

uniffi_macros::include_scaffolding!("cryptatools");

This step generates a single Python file that you could import directly. Sadly, the good practice for importing these objects is a bit more complicated.

cryptatools-rs\cryptatools-core\setup.py

In order to import your own crate, create the corresponding Python files or folders under cryptatools-rs\cryptatools-core\bindings\python3\cryptatools-core\cryptanalysis\. There, import the necessary objects from cryptatools_core.python3_bindings. For example, the file cryptatools-rs\cryptatools-core\bindings\python3\cryptatools-core\cryptography\encryption\monoalphabetic_cipher\caesar_number.py only contains: from cryptatools_core.python3_bindings import CaesarNumberAlgorithm.

Once this is done, feel free to write unit tests, at least one for each method implemented. The tests are written in the file cryptatools-rs\cryptatools-core\binding-testing\testing.py.

You are now free to test and compile your code with the documentation at this link.

4-Benchmarking

Sometimes a function may spend too much time. In this case, you can measure your specific function starting from the template in the file benches\cryptatools_benchmark\main.rs.

Then test your function with:

cargo bench

In the current state of cryptatools, a pull request is not required to benchmark all of its code, but doing so is considered a good improvement.

5-Create cli.

Each feature of cryptatools-core is meant to be exposed through the CLI in the folder cryptatools-cli. Once you have finished implementing your feature in cryptatools-core and in the Python bindings, feel free to create the CLI for your modifications.

Then test it with:

cargo install --path .\cryptatools-cli\ ;

Usage

cryptatools-cli also offers a way to script the library from the command line. Do not forget to install each program. See how to install these here for more information.

cryptatools-cli-stats performs statistical cryptanalysis attacks. Its first argument is the statistical attack algorithm (for example coincidence index, frequency analysis, etc.), its second argument is the encrypted opcodes to attack, and its last argument is a JSON value describing the alphabet, mapping each symbol to its set of opcodes.

On windows:

cryptatools-cli-stats frequency-analysis "123234" '{\"a\": [\"12\"], \"b\": [\"32\"], \"c\": [\"34\"]}'

On Linux:

cryptatools-cli-stats frequency-analysis "123234" '{"a": ["12"], "b": ["32"], "c": ["34"]}'

cryptatools-cli-encrypt uses cryptography algorithms to encrypt data. Obviously you can use it for brute force cryptanalysis attacks, but this is not the main philosophy of cryptatools-rs.

For more examples, please refer to the Tutorial or to the Documentation.

1 Intro

In order to connect with smart contracts, you might need to code in nodejs using cryptatools-rs bindings in order to interact with hardhat/web3js.

The full code example is available here: Click Here.

2 Create bindings

Sadly, uniffi-rs does not currently implement JavaScript bindings, so we have to write these ourselves using the wasm-bindgen Rust WebAssembly tool. See this link.

./src/lib.rs

use cryptatools_core::utils::alphabets::Alphabet;
use cryptatools_core::cryptanalysis::custom::general_cryptanalysis_methods::hash_cryptanalysis::birthday_paradox::BirthdayParadox;

use wasm_bindgen::prelude::*;


#[wasm_bindgen(start)]
pub fn main() -> Result<(), JsValue> {
    // Use `web_sys`'s global `window` function to get a handle on the global
    // window object.
    let window = web_sys::window().expect("no global `window` exists");
    let document = window.document().expect("should have a document on window");
    let body = document.body().expect("document should have a body");

    // Manufacture the element we're gonna append
    let val = document.create_element("p")?;
    val.set_inner_html("Hello from Rust!");

    body.append_child(&val)?;

    Ok(())
}

#[wasm_bindgen]
pub fn birthday(block: &str) -> f64 {
    let hexadecimal_alphabet = Alphabet::hexadecimal_ascii_lowercase_sixteen_bits_alphabet();
    let bp = BirthdayParadox::new(hexadecimal_alphabet.into());
    let target_hash = block.as_bytes().to_vec();
    bp.calculate_birthday_paradox_expecting_percent_focusing_on_speed_with_taylor(target_hash.clone(), 0.50)
}
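The name of the cryptatools method called above suggests a Taylor-series estimate of the birthday bound. How cryptatools-core actually computes it is not shown in this tutorial, so the following is only a sketch of the textbook approximation n ≈ sqrt(2 · N · ln(1 / (1 − p))), which follows from p ≈ 1 − exp(−n² / (2N)). The function name birthday_attempts and its parameters are hypothetical and are not part of the cryptatools API.

```rust
/// Approximate number of random draws from a space of `space` equally likely
/// values needed to observe a collision with probability `probability`.
/// Hypothetical helper: textbook approximation, not the cryptatools API.
pub fn birthday_attempts(space: f64, probability: f64) -> f64 {
    // From p ~= 1 - exp(-n^2 / (2N)), solved for n.
    (2.0 * space * (1.0 / (1.0 - probability)).ln()).sqrt()
}
```

For the classic birthday problem, birthday_attempts(365.0, 0.5) is about 22.5. For a 160-bit digest (N = 2^160) and p = 0.5 it gives roughly 1.18 × 2^80 attempts.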

3 Call bindings

Then we have to call bindings.

index.js


import { birthday } from './pkg';

let attempts = birthday('71C7656EC7ab88b098defB751B7401B5f6d8976F');

console.log(attempts);

package.json

{
  "scripts": {
    "build": "webpack",
    "serve": "webpack serve"
  },
  "devDependencies": {
    "@wasm-tool/wasm-pack-plugin": "1.5.0",
    "html-webpack-plugin": "^5.3.2",
    "text-encoding": "^0.7.0",
    "webpack": "^5.88.2",
    "webpack-cli": "^4.7.2",
    "webpack-dev-server": "^4.15.1"
  },
  "dependencies": {
    "express": "^4.18.2"
  }
}

4 Configuration Files

The code above cannot work without its configuration file:

webpack.config.js

const path = require('path');
const HtmlWebpackPlugin = require('html-webpack-plugin');
const webpack = require('webpack');
const WasmPackPlugin = require("@wasm-tool/wasm-pack-plugin");

module.exports = {
    entry: './index.js',
    output: {
        path: path.resolve(__dirname, 'dist'),
        filename: 'index.js',
    },
    plugins: [
        new HtmlWebpackPlugin(),
        new WasmPackPlugin({
            crateDirectory: path.resolve(__dirname, ".")
        }),
        // Have this example work in Edge which doesn't ship `TextEncoder` or
        // `TextDecoder` at this time.
        new webpack.ProvidePlugin({
          TextDecoder: ['text-encoding', 'TextDecoder'],
          TextEncoder: ['text-encoding', 'TextEncoder']
        })
    ],
    mode: 'development',
    experiments: {
        asyncWebAssembly: true
   }
};

Use cryptanalysis attack method against a caesar encryption shellcode for Malware analysis with cryptatools and radare2 rust bindings (r2pipe)

1-Introduction

Knowledge required:

cryptanalysis on caesar cipher
x86_32 assembly language programming. Check your language here.

Shellstorm is a pentester resource that provides shellcode for cybersecurity researchers. Sadly, its shellcode could also be used by attackers to create trojans, for example by embedding the shellcode in an image or an executable so that it runs when the file is opened.

One of these shellcodes uses a Caesar cipher to hide its signature from antivirus software.

Today we are going to use cryptatools to break the Caesar encryption and deobfuscate a malware sample for reverse engineering purposes. We will explore ways other than brute force to break this Caesar encryption, and we will break it in the end.

2-Reverse Engineering the shellcode

We want to look at the machine code of the shellcode in order to see where the data is stored and to confirm the encryption algorithm through reverse engineering.

In the case of this analysis, we have access to the Shellstorm shellcode sample link here. So we already have the opcode of the shellcode in the format \x12\x34. So let's copy the opcodes \xeb\x25\x5e\x31\xc9\xb1\x1e\x80\x3e\x07\x7c\x05\x80\x2e\x07\xeb\x11\x31\xdb\x31\xd2\xb3\x07\xb2\xff\x66\x42\x2a\x1e\x66\x29\xda\x88\x16\x46\xe2\xe2\xeb\x05\xe8\xd6\xff\xff\xff\x38\xc7\x57\x6f\x69\x68\x7a\x6f\x6f\x69\x70\x75\x36\x6f\x36\x36\x36\x36\x90\xea\x57\x90\xe9\x5a\x90\xe8\xb7\x12\xd4\x87 to the file shellcode.txt.

We are now going to turn the shellcode into a binary file in order to disassemble it.

First, convert it to a plain hexadecimal format such as 1234:

echo "$(cat shellcode.txt | tr -d 'x' | tr -d '\\' | tr -d '\n')" > opcode.txt
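The same normalization, together with the hex decoding that the next step performs with xxd, can be sketched in Rust with the standard library only. The function name parse_escaped_shellcode is hypothetical; like the tr pipeline, it simply drops every backslash, 'x' and line break before decoding.

```rust
/// Convert text such as `\xeb\x25\x5e` into raw bytes.
/// Returns None if what remains after stripping is not an even number of hex digits.
/// Hypothetical helper, not part of cryptatools.
pub fn parse_escaped_shellcode(text: &str) -> Option<Vec<u8>> {
    // Same effect as: tr -d 'x' | tr -d '\\' | tr -d '\n' (plus '\r' for Windows files).
    let hex: Vec<u8> = text
        .bytes()
        .filter(|&b| !matches!(b, b'\\' | b'x' | b'\n' | b'\r'))
        .collect();
    if hex.len() % 2 != 0 || !hex.iter().all(u8::is_ascii_hexdigit) {
        return None;
    }
    // Decode each pair of hex digits into one byte, like xxd -r -p.
    hex.chunks(2)
        .map(|pair| {
            let digits = std::str::from_utf8(pair).ok()?;
            u8::from_str_radix(digits, 16).ok()
        })
        .collect()
}
```

Writing the returned bytes to a file with std::fs::write would produce the same content as the xxd command.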

Then convert the hexadecimal text into raw bytes:

xxd -r -p opcode.txt bin

The file bin is created. It is the assembled shellcode. We could as well run it with:

chmod u+x ./bin
./bin

But our goal is to break it. Not to let it infect us. So let's disassemble it with radare2:

radare2 -a x86 -b 32 ./bin

radare2 is a reverse engineering framework. We use it to read the machine code of the shellcode. The option -a x86 tells radare2 to disassemble x86 machine code. The option -b 32 tells radare2 to read 32-bit code. We could use -AA to get as much information as possible about the binary, but we prefer manual documentation in this tutorial.

Now in the radare2 console, type Vp to switch to the disassembled machine code, then you can read the code of the shellcode.

jmp 0x27
pop esi
xor ecx, ecx
mov cl, 0x1e
cmp byte [esi], 7
jl 0x11
sub byte [esi], 7
jmp 0x22
xor ebx, ebx
xor edx, edx
mov bl, 7
mov dl, 0xff
inc dx
sub bl, byte [esi]
sub dx, bx
mov byte [esi], dl
inc esi
loop 7
jmp 0x2c
call 0x02
cmp bh, al
push edi
outsd dx, dword [esi]
imul ebp, dword [eax + 0x7a], 0x70696f6f
jne 0x6f
outsd dx, dword [esi]
nop
ljmp 0xe890
mov bh, 0x12
aam 0x87

Let's add some labels to document it:

_start:
    jmp call_decoder
decoder:
    pop esi
    xor ecx, ecx
    mov cl, 0x1e
decode:
    cmp byte [esi], 7
    jl lowbound
    sub byte [esi], 7
    jmp common_command
lowbound:
    xor ebx, ebx
    xor edx, edx
    mov bl, 7
    mov dl, 0xff
    inc dx
    sub bl, byte [esi]
    sub dx, bx
    mov byte [esi], dl
common_command:
    inc esi
    loop decode
    jmp shellcode
call_decoder:
    call decoder
shellcode:
    cmp bh, al
    push edi
    outsd dx, dword [esi]
    imul ebp, dword [eax + 0x7a], 0x70696f6f
    jne 0x6f
    outsd dx, dword [esi]
    nop
    ljmp 0xe890
    mov bh, 0x12
    aam 0x87

Let's read the assembly code of the shellcode!

The instruction jmp call_decoder jumps to the label call_decoder, where the instruction call decoder is executed. In order to really understand the shellcode, we have to dig into what call decoder does. Even if you know assembly language, do not skip this step.

In x86 assembly language, the instruction call does not only jump to the label. It first pushes the address of the next instruction onto the stack; when the called function returns, ret pops that address back into eip to resume execution there.

In shellcoding this is a very interesting feature because, in this shellcode, the next instruction is the label shellcode, so its address is pushed onto the stack.

decoder:
    pop esi

The first instruction of the called function shows a tricky and interesting property of assembly language: the address of shellcode is popped into esi!

cl is the register that stores the size of the encrypted content, here 0x1e. The loop instruction uses ecx as its counter.

Let's read the encrypted content:

p8 0x1e @ 0x2c ;
38c7576f69687a6f6f697075366f3636363690ea5790e95a90e8b712d487

38c7576f69687a6f6f697075366f3636363690ea5790e95a90e8b712d487 is the cipher text.

Now that we know where the encrypted payload is (at the label shellcode), we will first break it using only cryptatools cryptanalysis methods. Then we will explore each way to solve the problem, starting with the most cryptanalytic method and finishing with the least cryptanalytic one.

3.1 Caesar statistical analysis: Try when the shellcode size is too little.

The general idea is to measure how frequently each opcode appears in the code and then compare the result with the frequency distribution of plain text opcodes in a shellcode database, a malware database, or a goodware database (though the latter is less accurate).
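To make the measurement concrete, here is a minimal sketch that computes the relative frequency of each byte with the standard library only. It is not the cryptatools implementation; the names CIPHER and byte_frequencies are hypothetical, and CIPHER is the 30-byte cipher text read from the sample at offset 0x2c.

```rust
use std::collections::BTreeMap;

/// The encrypted payload of the sample (30 bytes at offset 0x2c).
pub const CIPHER: [u8; 30] = [
    0x38, 0xc7, 0x57, 0x6f, 0x69, 0x68, 0x7a, 0x6f, 0x6f, 0x69,
    0x70, 0x75, 0x36, 0x6f, 0x36, 0x36, 0x36, 0x36, 0x90, 0xea,
    0x57, 0x90, 0xe9, 0x5a, 0x90, 0xe8, 0xb7, 0x12, 0xd4, 0x87,
];

/// Relative frequency of every byte value that occurs in `data`.
/// Hypothetical helper, not part of cryptatools.
pub fn byte_frequencies(data: &[u8]) -> BTreeMap<u8, f64> {
    let mut counts: BTreeMap<u8, usize> = BTreeMap::new();
    for &byte in data {
        *counts.entry(byte).or_insert(0) += 1;
    }
    let total = data.len() as f64;
    counts
        .into_iter()
        .map(|(byte, count)| (byte, count as f64 / total))
        .collect()
}
```

Calling byte_frequencies(&CIPHER) yields 19 distinct byte values, the most frequent being 0x36 (5 of 30 bytes), 0x6f (4 of 30) and 0x90 (3 of 30).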

According to the work of Babak Bashari Rad, we already have the statistical distribution for main opcodes in x86:

See his work here.

So let's calculate frequency distribution in order to compare the plain with the cipher text and see similarities.

Cargo.toml

[package]
name = "caesar_shellcode_1_statistical_analysis"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
cryptatools-core = { git = "https://github.com/gogo2464/cryptatools-rs", package = 'cryptatools-core' }
serde_json = "1.0.91"
r2pipe = { git = "https://github.com/RHL120/r2pipe.rs", branch = "windows_bad" }
itertools = "0.10.5"

use r2pipe::R2Pipe;
use r2pipe::open_pipe;
use cryptatools_core::utils::alphabets::Alphabet;
use cryptatools_core::cryptanalysis::general_cryptanalysis_methods::frequency_analysis::distribution_algorithms::statistical::Statistical;
use std::u8;

fn read_plain_text(cipher_text: String) -> Vec<u8> {
  let mut bytes = Vec::new();
  for o in (0..cipher_text.len()).step_by(2) {
	let left = cipher_text.chars().nth(o).unwrap();
	let right = cipher_text.chars().nth(o+1).unwrap();
	let mut opcode = String::from(left);
	opcode.push(right);
	bytes.push(u8::from_str_radix(&opcode, 16).unwrap());
  }	
  
  bytes
}

fn main() {
  let mut r2p = open_pipe!(Some("bin")).unwrap();
  let cipher_text = String::from(r2p.cmd("p8 0x1e @ 0x2c ;").unwrap());
  // radare2 ends its output with a line break ("\n" or "\r\n"); trim it on any platform.
  let cipher_text = cipher_text.trim_end().to_string();
  
  println!("cipher text: {:?}", cipher_text);
  
  let unknow_opcode_alphabet = Alphabet::new_empty().unknow_opcodes();
  
  let bytes = read_plain_text(cipher_text);
  
  let stat = Statistical::new(unknow_opcode_alphabet.clone());
  let stat_percentage = stat.guess_statistical_distribution(bytes);
  
  for character in stat_percentage {
	  for opcode in character.0 {
		  println!("opcode {:x}, statistic: {:?}", opcode, character.1);
	  }
	  
  }
  
  r2p.close();
}

Let's run it with:

cargo run
   Compiling caesar_shellcode_1_statistical_analysis v0.1.0 (C:\Users\TRUNCATED\Desktop\cryptatools-rs\tutos\caesar_shellcode_1_statistical_analysis)
    Finished dev [unoptimized + debuginfo] target(s) in 1.15s
     Running `target\debug\caesar_shellcode_1_statistical_analysis.exe`
cipher text: "38c7576f69687a6f6f697075366f3636363690ea5790e95a90e8b712d487"
opcode e9, statistic: 0.03333333333333333
opcode e8, statistic: 0.03333333333333333
opcode 5a, statistic: 0.03333333333333333
opcode b7, statistic: 0.03333333333333333
opcode 36, statistic: 0.16666666666666666
opcode ea, statistic: 0.03333333333333333
opcode 57, statistic: 0.06666666666666667
opcode 6f, statistic: 0.13333333333333333
opcode 69, statistic: 0.06666666666666667
opcode 87, statistic: 0.03333333333333333
opcode 38, statistic: 0.03333333333333333
opcode 12, statistic: 0.03333333333333333
opcode c7, statistic: 0.03333333333333333
opcode 68, statistic: 0.03333333333333333
opcode 75, statistic: 0.03333333333333333
opcode 70, statistic: 0.03333333333333333
opcode 90, statistic: 0.1
opcode 7a, statistic: 0.03333333333333333
opcode d4, statistic: 0.03333333333333333

The most frequent opcodes are: 36, 6f and 90.

These could be either: mov, push or call.

3.1.1.1 Is 36 a mov (89)?

Let's manually try to replace 36 by mov and see:

rasm2 -a x86 -b 32 "mov eax, eax"
89c0

so let's replace 36 by 89.

$text = "38c7576f69687a6f6f697075366f3636363690ea5790e95a90e8b712d487" ; $text = $text -replace "36","89"
rasm2 -a x86 -b 32 -D $text

0x00000000   2                     38c7  cmp bh, al
0x00000002   1                       57  push edi
0x00000003   1                       6f  outsd dx, dword [esi]
0x00000004   7           69687a6f6f6970  imul ebp, dword [eax + 0x7a], 0x70696f6f
0x0000000b   2                     7589  jne 0xffffff96
0x0000000d   1                       6f  outsd dx, dword [esi]
0x0000000e   6             8989898990ea  mov dword [ecx - 0x156f7677], ecx
0x00000014   1                       57  push edi
0x00000015   1                       90  nop
0x00000016   5               e95a90e8b7  jmp 0xb7e89075
0x0000001b   2                     12d4  adc dl, ah
0x0000001d   1                       87  invalid

We get an invalid opcode, so this guess must be wrong.

3.1.1.2 Is 36 a push (50) ?

Let's manually try to replace 36 by push and see:

rasm2 -a x86 -b 32 "push eax"
50

I suspect there are many opcodes for push, depending on the operand. Let's first test whether it is a call, and come back to push if we do not find anything better.

3.1.1.3 Is 36 a call (e8)?

rasm2 -a x86 -b 32 "call 0x00"
e8fbffffff

$text = "38c7576f69687a6f6f697075366f3636363690ea5790e95a90e8b712d487" ; $text = $text -replace "36","89"
rasm2 -a x86 -b 32 -D $text
0x00000000   2                     38c7  cmp bh, al
0x00000002   1                       57  push edi
0x00000003   1                       6f  outsd dx, dword [esi]
0x00000004   7           69687a6f6f6970  imul ebp, dword [eax + 0x7a], 0x70696f6f
0x0000000b   2                     7589  jne 0xffffff96
0x0000000d   1                       6f  outsd dx, dword [esi]
0x0000000e   6             8989898990ea  mov dword [ecx - 0x156f7677], ecx
0x00000014   1                       57  push edi
0x00000015   1                       90  nop
0x00000016   5               e95a90e8b7  jmp 0xb7e89075
0x0000001b   2                     12d4  adc dl, ah
0x0000001d   1                       87  invalid

Invalid. It is a fail!

3.1.1.4 Is 36 another push ?

Now that we have eliminated the other possibilities, we know that 36 is probably register or stack data. We will come back to it later.

3.1.2.1 is `6f` a `mov`, `push` or a `call`. Let's try a `mov`

rasm2 -a x86 -b 32 "mov eax, eax"
89c0

$text = "38c7576f69687a6f6f697075366f3636363690ea5790e95a90e8b712d487" ; $text = $text -replace "6f","89"
rasm2 -a x86 -b 32 -D $text
0x00000000   2                     38c7  cmp bh, al
0x00000002   1                       57  push edi
0x00000003   3                   896968  mov dword [ecx + 0x68], ebp
0x00000006   2                     7a89  jp 0xffffff91
0x00000008   3                   896970  mov dword [ecx + 0x70], ebp
0x0000000b   2                     7536  jne 0x43
0x0000000d   2                     8936  mov dword [esi], esi
0x0000000f   4                 36363690  nop
0x00000013   7           ea5790e95a90e8  ljmp 0xe890:0x5ae99057
0x0000001a   2                     b712  mov bh, 0x12
0x0000001c   2                     d487  aam 0x87

Seems to be possible. But the jp instruction is weird. Let's continue to see.

3.1.2.2 is `6f` a `mov`, `push` or a `call`. Let's try a `push`

rasm2 -a x86 -b 32 "push 0x12345678"
6878563412

$text = "38c7576f69687a6f6f697075366f3636363690ea5790e95a90e8b712d487" ; $text = $text -replace "6f","68"
rasm2 -a x86 -b 32 -D $text
0x00000000   2                     38c7  cmp bh, al
0x00000002   1                       57  push edi
0x00000003   5               6869687a68  push 0x687a6869
0x00000008   5               6869707536  push 0x36757069
0x0000000d   5               6836363636  push 0x36363636
0x00000012   1                       90  nop
0x00000013   7           ea5790e95a90e8  ljmp 0xe890:0x5ae99057
0x0000001a   2                     b712  mov bh, 0x12
0x0000001c   2                     d487  aam 0x87

Seems more than likely possible. We remember the values 0x687a6869, 0x36757069 and 0x36363636 as data for future analysis.

3.1.3.1 is `90` a `mov`, `push` or a `call`. Let's try a `mov`.

rasm2 -a x86 -b 32 "mov eax, eax"
89c0

$text = "38c7576f69687a6f6f697075366f3636363690ea5790e95a90e8b712d487" ;
$text = $text -replace "6f","68"
$text = $text -replace "90","89"
rasm2 -a x86 -b 32 -D $text
0x00000000   2                     38c7  cmp bh, al
0x00000002   1                       57  push edi
0x00000003   5               6869687a68  push 0x687a6869
0x00000008   5               6869707536  push 0x36757069
0x0000000d   5               6836363636  push 0x36363636
0x00000012   2                     89ea  mov edx, ebp
0x00000014   1                       57  push edi
0x00000015   2                     89e9  mov ecx, ebp
0x00000017   1                       5a  pop edx
0x00000018   2                     89e8  mov eax, ebp
0x0000001a   2                     b712  mov bh, 0x12
0x0000001c   2                     d487  aam 0x87

The shellcode sample is too small. We cannot go any further.

3.2 Caesar brute force

After reverse engineering the shellcode decryptor, we now know that the encryption algorithm is Caesar. So let's implement a brute forcer to decrypt it.

The real challenge is: how do we detect plain text x86 32-bit opcodes?

The first idea is to disassemble the code and check that no instruction is invalid. Sadly this method is not perfect: on a dense instruction set almost any byte sequence decodes to something (the 8051 has 255 instructions!). Even the encrypted shellcode in this x86_32 sample does not contain any invalid instruction.

We can then combine it with two tools: 1- the frequency analysis seen previously. The idea is to check whether the decrypted opcodes have the same distribution as plain text. 2- the coincidence index. Same concept, but it also considers the distribution with respect to the previous instructions.
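As an illustration of the second tool, here is a sketch of the textbook index of coincidence, IC = Σ n_i (n_i − 1) / (N (N − 1)), computed over single bytes. Whether cryptatools uses exactly this definition is an assumption; the function name index_of_coincidence is hypothetical.

```rust
/// Textbook index of coincidence over byte values: the probability that two
/// bytes picked at random (without replacement) from `data` are equal.
/// Hypothetical helper, not the cryptatools API.
pub fn index_of_coincidence(data: &[u8]) -> f64 {
    let n = data.len();
    if n < 2 {
        return 0.0;
    }
    // Count how many times each of the 256 byte values occurs.
    let mut counts = [0u64; 256];
    for &byte in data {
        counts[byte as usize] += 1;
    }
    let equal_pairs: u64 = counts.iter().map(|&c| c * c.saturating_sub(1)).sum();
    equal_pairs as f64 / (n as f64 * (n as f64 - 1.0))
}
```

Note that shifting every byte by the same key only renames the byte values, so with this definition the result is identical for every Caesar key. It is therefore more useful for recognizing a monoalphabetic cipher than for ranking key candidates.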

cargo new caesar_shellcode_2_caesar_brute_forcer

Cargo.toml

[package]
name = "caesar_shellcode_2_caesar_brute_forcer"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
cryptatools-core = { git = "https://github.com/gogo2464/cryptatools-rs", package = 'cryptatools-core' }
serde_json = "1.0.91"
r2pipe = { git = "https://github.com/RHL120/r2pipe.rs", branch = "windows_bad" }
itertools = "0.10.5"

main.rs

use r2pipe::R2Pipe;
use r2pipe::open_pipe;
use cryptatools_core::utils::alphabets::Alphabet;
use cryptatools_core::cryptanalysis::general_cryptanalysis_methods::frequency_analysis::distribution_algorithms::statistical::Statistical;
use cryptatools_core::cryptography::classical::encryption::monoalphabetic_ciphers::caesar_number::CaesarNumberAlgorithm;
use std::u8;
use std::fmt::Write;

fn read_plain_text(cipher_text: String) -> Vec<u8> {
  let mut bytes = Vec::new();
  for o in (0..cipher_text.len()).step_by(2) {
	let left = cipher_text.chars().nth(o).unwrap();
	let right = cipher_text.chars().nth(o+1).unwrap();
	let mut opcode = String::from(left);
	opcode.push(right);
	bytes.push(u8::from_str_radix(&opcode, 16).unwrap());
  }	
  
  bytes
}

fn convert_u8_to_text(u8_vector: Vec<u8>) -> String {
  let mut string = String::new();
  for num in u8_vector {
	// Always emit two hex digits so that values below 0x10 keep their leading zero.
	write!(&mut string, "{num:02x}").unwrap();
  }
  
  string
}

fn is_plain_text(probably_cipher_text: Vec<u8>) -> bool {
	  let mut r2p = open_pipe!(Some("-")).unwrap();
	  
	  r2p.cmd("e anal.arch=x86 ; e asm.bits=32 ;").unwrap();
	  
	  let string_probably_cipher_text = convert_u8_to_text(probably_cipher_text.clone());
	  let mut cmd = String::from("");
	  write!(&mut cmd, "wx {string_probably_cipher_text} ;");
	  
	  r2p.cmd(&cmd).unwrap();
	  let instructions = String::from(r2p.cmd("pI 0x1e @ 0x00 ;").unwrap());
	  
	  println!("instructions: {:?}", instructions);
	  
	  // Reject the candidate as soon as the disassembler reports an invalid instruction.
	  if instructions.contains("invalid") {
		return false;  
	  }
	
	  let unknow_opcode_alphabet = Alphabet::new_empty().unknow_opcodes();
	  let stat = Statistical::new(unknow_opcode_alphabet.clone());
	  let stat_percentage = stat.guess_statistical_distribution(probably_cipher_text);
	  
	  for character in stat_percentage {
		  for opcode in character.0 {
			  println!("opcode {:x}, statistic: {:?}", opcode, character.1);
			  if opcode == 0x89 && character.1 > 20.0/100.0 && character.1 < 40.0/100.0 { //mov
			      return false;
		      }
		  }
	  }
	  
	  true
}

fn main() {
  let mut r2p = open_pipe!(Some("bin")).unwrap();
  let cipher_text = String::from(r2p.cmd("p8 0x1e @ 0x2c ;").unwrap());
  // radare2 ends its output with a line break ("\n" or "\r\n"); trim it on any platform.
  let cipher_text = cipher_text.trim_end().to_string();
  
  println!("cipher text: {:?}", cipher_text);
  
  let unknow_opcode_alphabet = Alphabet::new_empty().unknow_opcodes();
  let bytes = read_plain_text(cipher_text);
  let c = CaesarNumberAlgorithm::new(unknow_opcode_alphabet.into());
  // Try every shift starting from 1, always decrypting the original cipher text,
  // and stop at the first candidate accepted by is_plain_text.
  let mut key = 1;
  let mut decrypted = c.decrypt_by_opcode_shift(bytes.clone(), key);
  while !is_plain_text(decrypted.clone()) {
	  key += 1;
	  decrypted = c.decrypt_by_opcode_shift(bytes.clone(), key);
  }
  
  println!("plain text:  {:?} decrypted with key {:?}", decrypted, key);
  r2p.close();
}
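As a dependency-free cross-check of the brute forcer above, the sketch below tries all 256 shifts and keeps those whose output contains the byte pair cd 80 (int 0x80, the 32-bit Linux system call instruction). Using this marker is an assumption about the payload, not something cryptatools does; the names CIPHER and candidate_keys are hypothetical.

```rust
/// The encrypted payload of the sample (30 bytes at offset 0x2c).
pub const CIPHER: [u8; 30] = [
    0x38, 0xc7, 0x57, 0x6f, 0x69, 0x68, 0x7a, 0x6f, 0x6f, 0x69,
    0x70, 0x75, 0x36, 0x6f, 0x36, 0x36, 0x36, 0x36, 0x90, 0xea,
    0x57, 0x90, 0xe9, 0x5a, 0x90, 0xe8, 0xb7, 0x12, 0xd4, 0x87,
];

/// Return every key whose decryption contains `int 0x80` (bytes cd 80).
/// Hypothetical heuristic: it assumes a 32-bit Linux payload.
pub fn candidate_keys(cipher: &[u8]) -> Vec<u8> {
    (0..=255u8)
        .filter(|&key| {
            // Undo the shift modulo 256 for this candidate key.
            let plain: Vec<u8> = cipher.iter().map(|b| b.wrapping_sub(key)).collect();
            plain.windows(2).any(|pair| pair == [0xcd, 0x80])
        })
        .collect()
}
```

On the cipher text of this tutorial, candidate_keys(&CIPHER) keeps only key 7.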

3.3 Caesar breaking using reverse engineering analysis and implementing a decryptor with cryptatools.

Now, let's determine the algorithm using reverse engineering method!

Sometimes assembly language can be hard to read and reverse because of anti-reverse-engineering protections. That is not the case here, so let's continue digging into the reverse engineering.

decode:
    cmp byte [esi], 7
    jl lowbound
    sub byte [esi], 7
    jmp 0x22
lowbound:
    xor ebx, ebx
    xor edx, edx
    mov bl, 7
    mov dl, 0xff
    inc dx
    sub bl, byte [esi]
    sub dx, bx
    mov byte [esi], dl
common_command:
    inc esi
    loop 7
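Both branches of this loop amount to the same operation: a byte of at least 7 has 7 subtracted directly, and a smaller byte is replaced by 0x100 − (7 − byte), which is again byte − 7 modulo 256. A minimal standard-library sketch of the stub follows; the names caesar_decode and to_hex are hypothetical and this is not the cryptatools API.

```rust
/// Mirror of the decoder stub: subtract `key` from every byte modulo 256.
/// Hypothetical helper, not part of cryptatools.
pub fn caesar_decode(cipher: &[u8], key: u8) -> Vec<u8> {
    cipher.iter().map(|byte| byte.wrapping_sub(key)).collect()
}

/// Render bytes as lowercase hex, two digits per byte.
pub fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|byte| format!("{byte:02x}")).collect()
}
```

Applied with key 7 to the 30-byte cipher text 38c7...d487, this produces 31c05068626173686862696e2f682f2f2f2f89e35089e25389e1b00bcd80.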

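After this reverse engineering, we know that the Caesar key is 7: it is the constant that the instruction sub byte [esi], 7 subtracts from every byte (loop 7 only jumps back to the decode label at offset 7 until ecx reaches zero). Let's create a decryptor with cryptatools.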

cargo new caesar_shellcode_decryptor ;
cd caesar_shellcode_decryptor ;

Let's edit the Cargo.toml:

[package]
name = "caesar_shellcode_decryptor"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
cryptatools-core = { git = "https://github.com/gogo2464/cryptatools-rs", package = 'cryptatools-core' }
serde_json = "1.0.91"
r2pipe = { git = "https://github.com/RHL120/r2pipe.rs", branch = "windows_bad" }
itertools = "0.10.5"

Now edit the main.rs

use r2pipe::R2Pipe;
use r2pipe::open_pipe;
use cryptatools_core::cryptography::classical::encryption::monoalphabetic_ciphers::caesar_number::CaesarNumberAlgorithm;
use cryptatools_core::utils::alphabets::Alphabet;
use std::u8;
use itertools::Itertools;
use std::fmt::Write;

// Convert the hex string printed by radare2 into raw bytes.
fn read_plain_text(cipher_text: String) -> Vec<u8> {
  let mut bytes = Vec::new();
  for o in (0..cipher_text.len()).step_by(2) {
    // Each byte is written as two hexadecimal digits.
    bytes.push(u8::from_str_radix(&cipher_text[o..o + 2], 16).unwrap());
  }

  bytes
}


fn main() {
  let mut r2p = open_pipe!(Some("bin")).unwrap();
  let mut cipher_text = String::from(r2p.cmd("p8 0x1e @ 0x2c ;").unwrap());
  cipher_text.remove(cipher_text.len()-1);
  cipher_text.remove(cipher_text.len()-1);
  
  println!("cipher text: {:?}", cipher_text);
  
  let unknow_opcode_alphabet = Alphabet::new_empty().unknow_opcodes();
  let mut c: CaesarNumberAlgorithm = CaesarNumberAlgorithm::new(unknow_opcode_alphabet.into());
  
  let bytes = read_plain_text(cipher_text);
  let decrypted = c.decrypt_by_opcode_shift(bytes, 7);

  // Render the decrypted bytes as a hex string, two digits per byte.
  let mut string = String::new();
  for num in decrypted {
    write!(&mut string, "{num:02x}").unwrap();
  }
  
  println!("plain text : {:?}", string);
  
  r2p.close();
}

$ rasm2 -a x86 -b 32 -D "31c05068626173686862696e2f682f2f2f2f89e35089e25389e1b00bcd80"
0x00000000   2                     31c0  xor eax, eax
0x00000002   1                       50  push eax
0x00000003   5               6862617368  push 0x68736162
0x00000008   5               6862696e2f  push 0x2f6e6962
0x0000000d   5               682f2f2f2f  push 0x2f2f2f2f
0x00000012   2                     89e3  mov ebx, esp
0x00000014   1                       50  push eax
0x00000015   2                     89e2  mov edx, esp
0x00000017   1                       53  push ebx
0x00000018   2                     89e1  mov ecx, esp
0x0000001a   2                     b00b  mov al, 0xb
0x0000001c   2                     cd80  int 0x80
> radare2 --
 -- In radare we trust
[0x00000000]> ? 0x68736162
segment 6873000:6162
string  "bash"
[0x00000000]> ? 0x2f6e6962
string  "bin/"
[0x00000000]> ? 0x2f2f2f2f
string  "////"
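
The three pushed constants can also be decoded without radare2. The sketch below is an illustration only; the helper name immediate_to_string is hypothetical. It relies on two facts about x86: 32-bit immediates are stored little-endian, and the stack grows downward, so the last value pushed ends up first in memory.

```rust
/// Decode a 32-bit immediate into the ASCII text it holds.
/// x86 is little-endian, so the least significant byte comes first in memory.
fn immediate_to_string(value: u32) -> String {
    value.to_le_bytes().iter().map(|&b| b as char).collect()
}

fn main() {
    // Immediates in the order the shellcode pushes them.
    let pushed = [0x68736162u32, 0x2f6e6962, 0x2f2f2f2f];
    // The last push sits at the lowest address, so reading memory from esp
    // upward means reading the pushes in reverse order.
    let path: String = pushed.iter().rev().map(|&v| immediate_to_string(v)).collect();
    println!("{}", path); // prints "////bin/bash"
}
```

The string that ebx points to is therefore ////bin/bash, the program passed to the execve system call (al = 0xb).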

3.4 Caesar breaking using only reverse engineering analysis by debugging. No cryptanalysis.

In this part we are going to debug the full shellcode in order to decrypt it. This part requires a Linux OS.

radare2 -a x86 -b 32 -e esil.romem=true -e emu.write=true -e io.cache=true -s 0x0 -c "aei; aeip; aeim 0xffffd000 0x2000 stack; aesu 0x25; Vp" bin

> i | grep size
size     0x4a
> pI 0x4a @ 0x00
jmp 0x27
pop esi
xor ecx, ecx
mov cl, 0x1e
cmp byte [esi], 7
jl 0x11
sub byte [esi], 7
jmp 0x22
xor ebx, ebx
xor edx, edx
mov bl, 7
mov dl, 0xff
inc dx
sub bl, byte [esi]
sub dx, bx
mov byte [esi], dl
inc esi
loop 7
jmp 0x2c
call 2
xor eax, eax
push eax
push 0x68736162
push 0x2f6e6962
push 0x2f2f2f2f
mov ebx, esp
push eax
mov edx, esp
push ebx
mov ecx, esp
mov al, 0xb
int 0x80

The decrypted shellcode:

xor eax, eax
push eax
push 0x68736162
push 0x2f6e6962
push 0x2f2f2f2f
mov ebx, esp
push eax
mov edx, esp
push ebx
mov ecx, esp
mov al, 0xb
int 0x80

It corresponds exactly to what we found using cryptanalysis!

4 Conclusion

Breaking the Caesar algorithm with statistics is trivial on plain text. It is different when executables are encrypted, because we have to deal with opcode encryption as well as detect plain-text assembly language. Though in this case the easiest way was to implement a decryptor after recovering the key by reverse engineering, this is sometimes impossible, for example when a malware deletes the key or only fetches it from a remote server that may be disconnected.

Fortunately, the byte parsing system of cryptatools makes it possible to run cryptanalysis attacks against these bytes as well as to recognize plain text in them.

Once the plain text is found, we can use radare2 emulation to emulate and run a single function: the one with plain-text opcodes.

Sources and references

Shellcode coding tutorial

Shellstorm shellcode sample

Full source code of this course.

Evaluating Ethereum address collision probabilities with birthday paradox method using ethers-rs and cryptatools-rs

I - Intro And Challenges

1. Intro

Ethereum is a decentralized cryptocurrency. Each user has their own wallet, authenticated with a public/private key pair. As Ethereum is decentralized, anyone who manages to generate the address of somebody else is able to spoof that person's identity. Fortunately, addresses are random, so an attacker would need to create a lot of identities.

But how many identities would we need to create? A paper available here has already answered this question.

Today we are going to implement a cryptanalysis attack with cryptatools to compute the number of required attempts automatically instead of calculating it by hand.

2. Challenges And Terms

Originally, the Birthday Paradox Attack is an attack in which the attacker checks whether a hash output is long enough to resist a collision search by pure brute force. It is used on web2 hashing algorithms to know whether collisions are feasible. The attack assumes that the hash output is perfectly random. Though this method seems very brutal, the birthday paradox uses probability theory to evaluate whether a hash output is long enough.

Today we are going to implement a birthday paradox evaluation on the Ethereum cryptocurrency using cryptatools-rs. We use the birthday paradox not to run a hash attack, but to evaluate how many attempts are needed to have a 50% chance of reaching a collision between two wallets.
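
As a rough cross-check, the usual birthday approximation can be computed directly. The sketch below assumes the standard formula n ≈ sqrt(2 · N · ln(1 / (1 - p))) for N equally likely values; it is not necessarily the formula cryptatools uses internally, and the function name birthday_attempts is hypothetical.

```rust
/// Approximate number of random draws needed to reach a collision with
/// probability `p` among 2^bits equally likely values:
/// n ≈ sqrt(2 * 2^bits * ln(1 / (1 - p))).
fn birthday_attempts(bits: i32, p: f64) -> f64 {
    let space = 2f64.powi(bits);
    (2.0 * space * (1.0 / (1.0 - p)).ln()).sqrt()
}

fn main() {
    // An Ethereum address is 20 bytes long, i.e. 160 bits.
    let n = birthday_attempts(160, 0.50);
    println!("about {:e} attempts for a 50% collision chance", n);
}
```

For 160 bits and p = 0.50 this gives a value of roughly 1.42 × 10^24.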

II - Let's Do It

1. Fetching A Random Author Address

We need to fetch a random hash in order to check whether the hash is strong enough. We will use the ethers-rs library:

use ethers::prelude::*;

Then let's use it to actually fetch a block:

use ethers::prelude::*;

const WSS_URL: &str = "wss://mainnet.infura.io/ws/v3/c60b0bb42f8a4c6481ecd229eddaca27";

#[tokio::main]
async fn main() -> eyre::Result<()> {    
    let provider = Provider::<Ws>::connect(WSS_URL).await?;
    let mut stream = provider.subscribe_blocks().await?.take(1);
    let mut wallet_block: Option<Vec<u8>> = None;
    while let Some(block) = stream.next().await {
        if let Some(author) = block.author {
            wallet_block = Some(author.0.to_vec());

            println!("random hash author: {:?}", author);
        }
    }

    Ok(())
}

2. Configuring cryptatools-rs

In cryptatools-rs, the alphabet is a concept that applies not only to letters but to any format. We need to tell cryptatools-rs to use the hexadecimal alphabet because it computes the length in bits of the alphabet in order to apply the birthday paradox with the right number of bits.

We need to select the alphabet. We will use full_hexadecimal_alphabet because it contains every byte value from 0x00 to 0xff (256 values).

let hexadecimal_alphabet = Alphabet::new_empty().full_hexadecimal_alphabet();
let mut bp = BirtdayParadox::new(hexadecimal_alphabet.into());

Now we need to call the method calculate_birtday_paradox_expecting_percent_focusing_on_speed_with_taylor.

This method is fast enough to be computed on hashes longer than, say, 10 characters. It is sadly less precise than the method calculate_birtday_paradox_expecting_percent_focusing_on_precision, which we will not use here.
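
To illustrate the trade-off between speed and precision, here is a sketch comparing an exact product with a Taylor-style approximation on the classic 365-birthday example. These two functions are hypothetical illustrations and are not the cryptatools implementations; they only show why an exact computation needs one step per attempt while the approximation runs in constant time.

```rust
/// Exact collision probability after `n` draws among `space` values:
/// 1 - prod_{i=0}^{n-1} (1 - i / space). One multiplication per draw.
fn exact_probability(n: u64, space: f64) -> f64 {
    let mut no_collision = 1.0;
    for i in 0..n {
        no_collision *= 1.0 - (i as f64) / space;
    }
    1.0 - no_collision
}

/// Approximation based on the first-order Taylor expansion of exp:
/// 1 - exp(-n(n-1) / (2 * space)). Constant time, but only approximate.
fn taylor_probability(n: u64, space: f64) -> f64 {
    let n = n as f64;
    1.0 - (-n * (n - 1.0) / (2.0 * space)).exp()
}

fn main() {
    // Classic example: 23 people, 365 possible birthdays.
    println!("exact:  {:.4}", exact_probability(23, 365.0));
    println!("taylor: {:.4}", taylor_probability(23, 365.0));
}
```

With around 10^24 attempts, a loop of the exact kind would never finish, which is why an approximation is the practical choice for address-sized hashes.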

Let's write:

bp.calculate_birtday_paradox_expecting_percent_focusing_on_speed_with_taylor(target_hash.clone(), 0.50)

use ethers::prelude::*;

use cryptatools_core::utils::{convert::Encode, alphabets::split_bytes_by_characters_representation, alphabets::Alphabet, alphabets::Encoding};
use cryptatools_core::cryptanalysis::general_cryptanalysis_methods::frequency_analysis::coincidence_index::CoincidenceIndexGuesser;
use cryptatools_core::cryptanalysis::general_cryptanalysis_methods::hash_cryptanalysis::birthday_paradox::BirtdayParadox;

const WSS_URL: &str = "wss://mainnet.infura.io/ws/v3/c60b0bb42f8a4c6481ecd229eddaca27";

#[tokio::main]
async fn main() -> eyre::Result<()> {
    let hexadecimal_alphabet = Alphabet::new_empty().full_hexadecimal_alphabet();
    let bp = BirtdayParadox::new(hexadecimal_alphabet.into());
    
    let provider = Provider::<Ws>::connect(WSS_URL).await?;
    let mut stream = provider.subscribe_blocks().await?.take(1);
    let mut wallet_block: Option<Vec<u8>> = None;
    while let Some(block) = stream.next().await {
        if let Some(author) = block.author {
            wallet_block = Some(author.0.to_vec());
        }
    }

    if let Some(wallet_block) = wallet_block {
        let target_hash = wallet_block;
        println!("after {:?} attempts, there is 50% of chances to get a collision on ethereum addresses.", bp.calculate_birtday_paradox_expecting_percent_focusing_on_speed_with_taylor(target_hash.clone(), 0.50));
    } else {
        println!("Error: wallet not found. Check your internet connection.");
    }

    Ok(())
}

4. Running a Birthday paradox evaluation.

In the paper they mentioned:

For e = 50%, this gives n = 1.41 × 10^24.

Let's run...

After 1.4234013764919992e24 attempts, there is 50% of chances to get a collision on ethereum addresses.

Sounds perfect! It is almost exactly the same answer as the paper!

Of course there is a difference of about 0.01e24, but the birthday paradox is an approximation.

Conclusion

The Ethereum wallet hash and alphabet length provide the same resistance as Bitcoin's.

We do not have to worry about the probability of finding a wallet collision. It is very safe!

The source code of this course is here.

The project cryptatools-core is well documented. Each element of the library should be covered in the API documentation.

See the API Reference at this address: API-Ref

Why Do Cryptanalysis

1 - What Is Cryptanalysis?

Cryptography is a way to make information impossible to understand for an attacker (generally someone not entitled to read the information) who wants to read it.

Cryptography is often used by companies or end users who want to protect information such as passwords or private conversations from an adversarial spy.

Cryptanalysis is then the set of methods used to break cryptographic algorithms.

2 - Goals Of Cryptanalysis

Sometimes, the person who wants to read information is not illegitimate.

2.1 Malware Analysis

2.1.1 - Ransomware Decryption

This can be the case when a hacker has encrypted a company's files in order to ask for a ransom. If the files are still present on the disk, a security analyst can reverse engineer the ransomware executable to determine the encryption algorithm, then write a script with cryptatools-rs that attacks this algorithm and decrypts the victim's files without paying the ransom.

2.1.2 - Antivirus Updating

Malware is malicious software, often wrongly called a "virus" (a virus is a more specific kind of malware that necessarily spreads through files).

Sometimes malware is encrypted. A malware author may do this to bypass antivirus software, or to prevent a malware analyst from understanding how the malware works and so keep antivirus developers from updating their product.

2.2 Bug Bounty And Pentest

2.2.1 Bug Bounty

Software needs encryption to stop an attacker from eavesdropping. Vendors may run a bug bounty program to ask researchers to find vulnerabilities and legally reward them so the vulnerabilities can be fixed. It is a way to openly prove the security of a software product to its users.

Some examples:

2.2.1.1 On Network Encryption

When traffic is intercepted over Wi-Fi, 4G, 5G or any other medium, you can try to write cryptanalysis exploits against the software's encryption in order to decipher the traffic.

2.2.1.2 On Blockchain

There can be huge vulnerabilities in a blockchain that is not coded correctly. For example:

If the hash algorithm has collisions, we could forge a signature, take over any account on the blockchain and steal all the money.
If the blockchain uses a bad random number generator, we could guess the RSA private keys of all future users, which means we could steal all their money.
And so on...

See this reference for a non-exhaustive list of possible attacks on blockchains.

2.2.2 Pentest

Once an exploit is public or has been found by your pentest company, you can use it, after signing a contract, to secure a client company against cyber attacks. This process is known as a pentest.

In a pentest, each cryptanalysis exploit can be used to prove the lack of security of a network.

2.3 Cryptographic Research

If you are a researcher, you may need to test your own cryptographic or cryptanalytic algorithm.

2.4 - Secret Agencies

Every modern government uses hacking as a cyberwarfare method today. Agencies may use electronic warfare methods that are forbidden to civilians, and that we will not discuss here, to steal information from other governments in order to watch them and prevent economic or military war. So even the most offensive cryptanalysis methods may be carried out for them. This may include:

Listening to encrypted Wi-Fi traffic over a hacked Wi-Fi network (for example with cache poisoning).
Listening to 4G/5G in order to intercept the encrypted communications of smartphones.

Even if you have no contract with a company, you can legally sell your exploit to a reseller working with your government's secret agencies, such as Zerodium.

Zerodium is a kind of bug bounty platform for zero-day exploits. Zerodium will exploit the vulnerability instead of fixing it.