Received: 6 June 2024
Accepted: 18 October 2024
Available: 12 November 2024
The exchange of large amounts of information through public channels has become an everyday occurrence, a situation that creates significant risks in the event of cyber-attacks and motivates the academic and scientific community to develop new robust security schemes. The objective of the research was to use mathematical and artificial intelligence tools to propose new security schemes. The design and implementation of a crypto-steganographic algorithm for text is described below. The methodology consisted of using cellular automata to detect the edges of a carrier image, leveraging its color contrast diversity, and the Tinkerbell chaotic attractor to generate two pseudo-random sequences: one for the encryption scheme and the other to select the edge pixels of the carrier image where the cryptogram bits are hidden. Additionally, a verification phase was included in which the receiver provides a code to confirm that the stegoimage was not altered. The system key is shared between the sender and the receiver using the Diffie-Hellman algorithm. The proposed algorithm was subjected to a series of steganographic and cryptographic performance tests, including entropy analysis, mean square error (MSE), correlation coefficients, key sensitivity, peak signal-to-noise ratio (PSNR), normalized root mean square error (NRMSE), and the structural similarity index (SSI). The results of the PSNR, MSE, and SSI tests were compared with scientific benchmarks, revealing indicators that align with the standards of information security. Finally, a crypto-steganographic algorithm was consolidated as the result of an academic exercise whose indicators make it potentially applicable in real-world contexts.
Keywords: Security, cellular automata, chaos, cryptography, image edge detection, steganography.
El intercambio de grandes cantidades de información a través de canales públicos se ha convertido en algo cotidiano, situación que genera grandes riesgos ante posibles ciberataques y motiva a la comunidad académica y científica a desarrollar nuevos esquemas robustos de seguridad. El objetivo de la investigación fue utilizar herramientas de matemáticas e inteligencia artificial para proponer nuevos esquemas de seguridad. A continuación, se describe el diseño e implementación de un algoritmo cripto-esteganográfico para texto. La metodología empleada consistió en usar autómatas celulares para detectar los bordes de una imagen portadora, aprovechando la diversidad de contrastes de color, así como el atractor caótico Tinkerbell para generar dos secuencias pseudoaleatorias: una para el esquema de cifrado y otra para seleccionar los píxeles de los bordes de la imagen portadora, donde se ocultan los bits del criptograma. Además, se incluyó una fase de verificación, en la que el receptor proporciona un código para confirmar que la imagen stego no fue manipulada. La clave del sistema se comparte entre el emisor y el receptor mediante el algoritmo Diffie-Hellman. El algoritmo propuesto se sometió a una serie de pruebas de rendimiento esteganográfico y criptográfico, tales como el análisis de entropía, el error cuadrático medio (MSE), los coeficientes de correlación, la sensibilidad de la clave, la relación señal-ruido máxima (PSNR), el error cuadrático medio normalizado (NRMSE) y el índice de similitud estructural (SSI). Los resultados de las pruebas PSNR, MSE y SSI, se compararon con referencias científicas, revelando indicadores que se ajustan a los estándares de seguridad de la información. Finalmente, se consolidó un algoritmo criptoesteganográfico resultado de un ejercicio académico cuyos indicadores lo convierten en potencial de aplicación en contextos del mundo real.
Palabras clave:Seguridad, autómatas celulares, caos, criptografía, detección de bordes, esteganografía.
Advances in telecommunications, coupled with global population growth, have driven an exponential increase in Internet-connected devices, leading to the widespread adoption of new forms of communication accessible to the majority of the population. This growth was further accelerated by the spread of the SARS-CoV-2 virus. This trend is highlighted in the annual reports published by the International Telecommunication Union (ITU), the United Nations agency responsible for providing official statistics on information and communication technologies [
In recent decades, the intrinsic properties of dynamical systems that exhibit chaos have given rise to a line of research known as chaotic cryptography, which has produced increasingly promising results. In parallel, it has driven progress in the study and definition of new dynamical systems with more complex behaviors, a favorable condition for using a chaotic attractor to define cryptographic schemes with high security and performance indicators [
On the other hand, cellular automata have been applied in different contexts, particularly in the field of cryptography, where security schemes can be designed based on their evolution [
In summary, this paper describes the application of chaos theory and cellular automata, with the objective of designing a text encryption algorithm complemented with a steganographic scheme, a process that involved the selection of a chaotic attractor and the implementation of an edge detection method with cellular automata to hide a ciphertext. Some references found in the scientific literature, used to support the achievement of the results shown in this article, are presented below.
In [
In relation to the use of chaotic attractors, there is the work of [
Another work to remark is the one proposed in [
The model described throughout this article is part of the results of the research project entitled "Encryption models based on chaotic attractors", developed within the Faculty of Engineering of the Universidad Distrital Francisco José de Caldas, which gave rise to several undergraduate theses, such as [
To achieve the objectives of this work, the basic elements for the construction of the crypto-steganographic model are presented below, following a deductive, correlational, and mixed methodology. The cryptographic and steganographic schemes, intended for large volumes of information, are designed from general principles of mathematics and artificial intelligence. Although cause-effect relationships are considered, it is not possible to control all the factors involved, since chaotic systems are highly sensitive to their initial conditions and parameters. This calls for both a qualitative analysis, to characterize the level of complexity of the security model, and a quantitative analysis, to validate its security and performance. Figure 1 shows a synthesis of the process proposed in this work, which is described in sections 2.1 - 2.7.
2.1 Creation and secure exchange of keys
The OpenSSL [
The hashlib library, included in the Python installation [
Part of the 32-byte string is used to define the initial conditions of the Tinkerbell chaotic attractor, a two-dimensional discrete system modeled by the system of equations given in expression (1).
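For reference, the Tinkerbell map is commonly written in the literature as the following two-dimensional discrete system, where a, b, c, and d are real parameters. The symbol names below are assumed to match those of expression (1).

```latex
\begin{aligned}
x_{n+1} &= x_n^{2} - y_n^{2} + a\,x_n + b\,y_n, \\
y_{n+1} &= 2\,x_n y_n + c\,x_n + d\,y_n.
\end{aligned}
```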
This system displays complex behaviors in its trajectories in phase space, presenting both periodic and chaotic orbits depending on the choice of parameters, such as in [
Similarly, the Tinkerbell attractor, which is used throughout this work to generate the pseudo-random sequences for both the steganographic and the cryptographic processes, exhibits high sensitivity to initial conditions, which can be verified by calculating its Lyapunov exponents. For the particular case where b ranges between -0.6 and -0.5, both positive and negative exponents are observed, as shown in Figure 3.
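The sensitivity to initial conditions described above can be explored numerically. The following is a minimal sketch that assumes the standard form of the Tinkerbell map; the parameter values (a = 0.9, b = -0.6013, c = 2.0, d = 0.5) and the starting point (-0.72, -0.64) are values commonly quoted in the literature, not necessarily those used by the authors, and the function name `tinkerbell` is hypothetical.

```python
def tinkerbell(x0, y0, n, a=0.9, b=-0.6013, c=2.0, d=0.5):
    """Return n iterates (x_k, y_k), k = 1..n, of the Tinkerbell map.

    The default parameters are assumptions taken from the general
    literature on this map, not from the paper.
    """
    xs, ys = [], []
    x, y = x0, y0
    for _ in range(n):
        # Both updates use the previous (x, y), so they are computed together.
        x, y = x * x - y * y + a * x + b * y, 2.0 * x * y + c * x + d * y
        xs.append(x)
        ys.append(y)
    return xs, ys


# Two trajectories whose starting points differ by 1e-10 in x.
xs_a, ys_a = tinkerbell(-0.72, -0.64, 200)
xs_b, ys_b = tinkerbell(-0.72 + 1e-10, -0.64, 200)
gap = [abs(p - q) for p, q in zip(xs_a, xs_b)]
print("initial gap:", gap[0], "largest gap:", max(gap))
```

Comparing the first and the largest gap gives an informal view of how quickly nearby trajectories separate; a rigorous check would estimate the Lyapunov exponents, as the text indicates.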
2.2 Key segmentation
The 32-byte string obtained in the previous step is used as follows. The first 12 bytes serve as initial conditions of the Tinkerbell attractor (6 for X and 6 for Y), which generates a pseudo-random sequence for the encryption process. To increase the confusion and diffusion properties of the cryptosystem, and since both the cipher sequence and the cryptogram are stored in one-dimensional arrays, the next 4 bytes are used to define shifts: 2 for the cipher sequence and 2 for the cryptogram.
Likewise, for the steganographic process, the other 16 bytes are divided into 4 groups, used as shown in Figure 4. Bytes 16 to 27 give the initial conditions of the Tinkerbell chaotic system, which is iterated to pseudo-randomly choose the edge pixels of the carrier image in which the cryptogram will be embedded; the number of pixels is determined by the length of the cryptogram. For a cryptogram of n characters, i.e., 8n bits, at least n edge pixels are required, as 8 bits are hidden in each pixel. However, it is highly recommended that the number of edge pixels in the image significantly exceed n, to avoid modifying all of them.
To increase the dispersion of the data to be hidden in the selected pixels and to make the integrity verification code more sensitive to key variations, the last 4 bytes of Figure 4 are used to define shifts: two for the arrays containing the coordinates of the pixels chosen in the steganographic process and the other two for the stegoimage verification code.
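The byte layout described in this section can be sketched as follows. The sketch assumes that the 32-byte string is the SHA3-256 digest of the shared secret (the hash function named later in the text), and that the groups appear in the order in which they are described; the function name `segment_key` and the dictionary keys are hypothetical. How each byte group is converted into a numeric initial condition or shift is not specified here, so the groups are left as raw bytes.

```python
import hashlib


def segment_key(shared_secret: bytes) -> dict:
    """Split a SHA3-256 digest into the eight groups described in the text."""
    digest = hashlib.sha3_256(shared_secret).digest()  # always 32 bytes
    return {
        # Encryption half (bytes 0-15).
        "enc_x0": digest[0:6],                  # initial condition for X
        "enc_y0": digest[6:12],                 # initial condition for Y
        "enc_shift_sequence": digest[12:14],    # shift of the cipher sequence
        "enc_shift_cryptogram": digest[14:16],  # shift of the cryptogram
        # Steganography half (bytes 16-31).
        "stego_x0": digest[16:22],              # initial condition for X
        "stego_y0": digest[22:28],              # initial condition for Y
        "stego_shift_coords": digest[28:30],    # shift of the coordinate arrays
        "stego_shift_code": digest[30:32],      # shift of the verification code
    }


parts = segment_key(b"example shared secret")
for name, chunk in parts.items():
    print(name, chunk.hex())
```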
2.3 Binarization of the carrier image and edge detection
For this process, it was necessary to transform the original image to gray scale in order to avoid independent manipulation of each RGB layer. Subsequently, the resulting image is binarized by applying a Gaussian filter with reflective edges, for which neighborhoods of 9 x 91 pixels were taken as a basis to generate the reference threshold.
Since the use of cellular automata in edge detection, whose behavior is described in [
In this case, the cellular automaton is rectangular, the states correspond to zero or one, in [
Based on the identifiers shown in Figure 6, in [
Rule 29 (bottom edge): P29=P16⊕P8⊕P4⊕P1
Rule 113 (left edge): P113=P64⊕P32⊕P16⊕P1
Rule 263 (right edge): P263=P256⊕P2⊕P4⊕P1
Rule 449 (upper edge): P449=P64⊕P128⊕P256⊕P1
The entries corresponding to the identifiers involved in a given rule are all equal to 1, and all other entries are equal to 0. The symbol ⊕ denotes the XOR operation and Pj corresponds to the input position associated with the identifier j. The edge detection process begins by selecting one of these 4 rules and overlaying the table from Figure 6 to iteratively match the entries of the identifiers that are part of the rule with each of the pixels to be evolved, as illustrated in Figure 7. In this case, rule 29 is selected, and the pixel to be evolved is shaded in pink.
Subsequently, an XOR operation is performed on the 4 obtained results. This output replaces the value of the pixel shaded in pink in the original binarized matrix, as shown in Figure 8.
Repeating the previous process for each pixel of the binarized image yields the edges of the image.
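The evolution step described above can be sketched in a few lines. The sketch assumes that the identifiers of Figure 6 follow the usual layout for two-dimensional cellular automata (1 at the center and 2, 4, ..., 256 clockwise starting from the right neighbor), which is consistent with the edge names of rules 29, 113, 263, and 449. It also assumes that pixels outside the image count as 0 and that all pixels are updated synchronously into a new matrix. The names `OFFSETS` and `apply_rule` are hypothetical.

```python
# Assumed identifier layout (row offset, column offset) around the center pixel:
#    64 128 256
#    32   1   2
#    16   8   4
OFFSETS = {
    64: (-1, -1), 128: (-1, 0), 256: (-1, 1),
    32: (0, -1),    1: (0, 0),    2: (0, 1),
    16: (1, -1),    8: (1, 0),    4: (1, 1),
}


def apply_rule(image, rule):
    """Evolve every pixel of a binary image (list of 0/1 rows) once."""
    rows, cols = len(image), len(image[0])
    # Keep only the neighbors whose identifier is part of the rule number.
    active = [off for ident, off in OFFSETS.items() if rule & ident]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            value = 0
            for dr, dc in active:
                rr, cc = r + dr, c + dc
                # Cells outside the image are treated as 0 (null boundary).
                if 0 <= rr < rows and 0 <= cc < cols:
                    value ^= image[rr][cc]
            out[r][c] = value
    return out


square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in apply_rule(square, 29):  # rule 29: bottom edge
    print(row)
```

In this small example the bottom row of the square is preserved by rule 29, while the pixel at the center of the square, whose neighborhood is uniform, evolves to 0.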
2.4 Encryption strategy
The model presented in this paper is applied to text files with n characters, encoded in binary. The Tinkerbell system is then iterated 4n times, generating two sequences, corresponding to the X and Y axes of the system's phase space. Each of the 8n entries is used to encrypt one bit of the text. The initial conditions are given as part of the user’s key.
To store the results of iterating the X and Y components, two one-dimensional arrays are generated using the single-precision format of the IEEE 754 standard, which can represent magnitudes up to approximately 3.4 × 10^38. Subsequently, the eighth decimal digit of each entry of the X array and the sixth decimal digit of each entry of the Y array are selected, and the modulo 2 operation is applied to each of these digits, yielding a cipher sequence of zeros and ones with the same length as the text message expressed in binary.
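The digit selection and the modulo 2 step can be illustrated as follows. This is a minimal sketch under several assumptions: it works on Python double-precision floats rather than single-precision values, it truncates (rather than rounds) to reach the requested decimal place, and it places the bits derived from X before those derived from Y, an ordering that the text does not specify. The names `decimal_digit` and `cipher_bits` are hypothetical.

```python
from decimal import Decimal


def decimal_digit(value, position):
    """Return the digit at the given decimal place (1 = tenths) of |value|."""
    # Decimal(float) is the exact expansion of the stored binary value,
    # so the digit is read without an extra rounding step.
    scaled = Decimal(abs(value)) * (10 ** position)
    return int(scaled) % 10


def cipher_bits(xs, ys, x_digit=8, y_digit=6):
    """Map each X entry and each Y entry to one bit via digit parity."""
    bits_x = [decimal_digit(x, x_digit) % 2 for x in xs]
    bits_y = [decimal_digit(y, y_digit) % 2 for y in ys]
    return bits_x + bits_y  # assumption: X bits first, then Y bits


print(cipher_bits([0.375, -0.25], [0.5, 0.125]))
```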
To introduce confusion and diffusion into the cryptosystem and make it more robust against cryptanalysis attempts, both the cipher sequence and the binary text are shifted by several positions, as defined in the key segmentation section and schematized in Figure 4. The shift is determined by the key and the length of the original message.
Subsequently, a bitwise XOR is performed between the binary text and the cipher sequence, and a shift is then applied to the result to obtain the cryptogram. This further hides the relationship between the ciphertext and the original text and distributes the redundancy of the language, reducing the risk of differential and statistical attacks.
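The XOR and shift steps, together with their inverse, can be sketched as follows. The sketch assumes that the shifts are circular rotations, that the shift amounts are already available as integers (their derivation from the key bytes is not specified here), and that only the cipher sequence and the resulting cryptogram are rotated. All function names are hypothetical.

```python
def rotate(bits, k):
    """Circularly shift a list to the left by k positions (k may be negative)."""
    if not bits:
        return []
    k %= len(bits)
    return bits[k:] + bits[:k]


def text_to_bits(text):
    """Encode text as a flat list of bits, 8 per UTF-8 byte."""
    return [int(b) for byte in text.encode("utf-8") for b in format(byte, "08b")]


def bits_to_text(bits):
    """Inverse of text_to_bits."""
    groups = (bits[i:i + 8] for i in range(0, len(bits), 8))
    return bytes(int("".join(map(str, g)), 2) for g in groups).decode("utf-8")


def encrypt(bits, keystream, seq_shift, out_shift):
    shifted = rotate(keystream, seq_shift)          # shift the cipher sequence
    mixed = [p ^ k for p, k in zip(bits, shifted)]  # bitwise XOR
    return rotate(mixed, out_shift)                 # shift the result


def decrypt(cryptogram, keystream, seq_shift, out_shift):
    mixed = rotate(cryptogram, -out_shift)          # undo the final shift
    shifted = rotate(keystream, seq_shift)
    return [c ^ k for c, k in zip(mixed, shifted)]  # XOR is its own inverse


plain = text_to_bits("Hi")
stream = [1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0]  # toy keystream
secret = encrypt(plain, stream, seq_shift=3, out_shift=5)
print(bits_to_text(decrypt(secret, stream, seq_shift=3, out_shift=5)))
```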
2.5 Hiding the cryptogram
Once the length of the cryptogram and the list of edge pixels of the image chosen by the user have been obtained, the pixels in which the cryptogram bits will be hidden are selected. To this end, the Tinkerbell attractor is iterated again with other initial conditions until a sequence of the same length as the cryptogram is obtained. This sequence is sorted in ascending order, and the same changes of position are applied in parallel to the list of edge pixels.
Subsequently, according to the length of the cryptogram and using the Permuted Congruential Generator (PCG) pseudo-random number generator, a finite number of pixels in which the cryptogram will be embedded is selected from the reordered list of edge coordinates. The coordinates are stored in two one-dimensional arrays corresponding to the indices i and j, which determine the position of any pixel in the image along its height and width, respectively.
From the sequence used to reorder the edge pixels, another sequence of ones and zeros is obtained by taking the digit at the same decimal position in each generated value. If the digit in the selected position is odd, a one is assigned; otherwise, a zero is assigned.
Since the arrays and the random sequence have the same length, the sequence is traversed from the initial position i = 0, and if there is a zero, the coordinate at that position, in the two arrays, is left untouched, but if in the sequence there is a one this coordinate is passed to the end of the list.
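The reordering and the move-to-end step described above can be sketched as follows. The sketch assumes that the chaotic sequence and the coordinate list have the same length, and it reads the traversal as a stable partition: coordinates flagged with 0 keep their relative order, and coordinates flagged with 1 are appended afterwards in the order in which they are found. The PCG-based selection is not included. The function names are hypothetical.

```python
def reorder_by_sequence(sequence, coords):
    """Sort `sequence` ascending and apply the same permutation to `coords`."""
    order = sorted(range(len(sequence)), key=lambda i: sequence[i])
    return [coords[i] for i in order]


def move_flagged_to_end(coords, flags):
    """Keep coordinates flagged 0 in place order, then append those flagged 1."""
    kept = [c for c, f in zip(coords, flags) if f == 0]
    moved = [c for c, f in zip(coords, flags) if f == 1]
    return kept + moved


edge_pixels = [(0, 0), (1, 1), (2, 2), (3, 3)]   # (i, j) coordinates
chaotic_values = [0.31, -0.12, 0.77, 0.05]       # toy stand-in for the attractor
parity_flags = [1, 0, 0, 1]                      # toy stand-in for the digit parity

reordered = reorder_by_sequence(chaotic_values, edge_pixels)
print(move_flagged_to_end(reordered, parity_flags))
```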
On the previously selected and processed pixels, the Least Significant Bit (LSB) technique is applied to hide the cryptogram bits, three in each of the red and green layers and two in the blue layer. This makes it difficult to carry out a successful steganalysis process.
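The 3-3-2 distribution of bits across the color layers can be illustrated with one pixel and one cryptogram byte. In this sketch, the assignment of the most significant bits to the red layer and the least significant bits to the blue layer is an assumption, as are the function names `embed_byte` and `extract_byte`.

```python
def embed_byte(pixel, byte):
    """Hide one byte in an (r, g, b) pixel: 3 bits in R, 3 in G, 2 in B."""
    r, g, b = pixel
    r = (r & 0b11111000) | (byte >> 5)            # top 3 bits -> red LSBs
    g = (g & 0b11111000) | ((byte >> 2) & 0b111)  # middle 3 bits -> green LSBs
    b = (b & 0b11111100) | (byte & 0b11)          # low 2 bits -> blue LSBs
    return (r, g, b)


def extract_byte(pixel):
    """Recover the byte hidden by embed_byte."""
    r, g, b = pixel
    return ((r & 0b111) << 5) | ((g & 0b111) << 2) | (b & 0b11)


stego_pixel = embed_byte((200, 100, 50), 0b10110110)
print(stego_pixel, extract_byte(stego_pixel))
```

With this layout, the red and green values of a pixel change by at most 7 and the blue value by at most 3.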
2.6 Stegoimage verification code
In order to ensure the integrity of the stegoimage during the communication process, a Verification Code mechanism has been implemented. The Verification Code is designed to be corroborated by the receiver and involves a series of steps: (i) XOR Operation on RGB Layers: Initially, an XOR operation is performed on the pixels belonging to the three RGB layers of the stegoimage. This operation results in a two-dimensional matrix. (ii) Consecutive XOR Operations: Subsequently, consecutive XOR operations are applied within the obtained matrix, both horizontally (between the rows) and vertically (between the columns). This process generates two one-dimensional arrays. (iii) Array Concatenation: The first 32 values of each of the generated one-dimensional arrays are concatenated, resulting in a final Verification Code with a length of 64 bits. This Verification Code, once generated, serves as a checksum mechanism, ensuring the integrity of the stegoimage during transmission and reception.
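Steps (i) to (iii) can be sketched as follows. This is one possible reading of the description: the "consecutive XOR operations" are taken to be a running XOR over all rows (one value per column) and over all columns (one value per row). The text states a 64-bit code, but the reduction from XOR values to bits is not specified here, so the sketch simply returns the concatenated values. The function name `verification_code` and its `take` parameter are hypothetical.

```python
from functools import reduce
from operator import xor


def verification_code(image, take=32):
    """image: list of rows, each row a list of (r, g, b) integer tuples."""
    # (i) XOR the three color layers pixel by pixel -> two-dimensional matrix.
    mixed = [[r ^ g ^ b for (r, g, b) in row] for row in image]
    # (ii) XOR all rows together (one value per column) and
    #      all columns together (one value per row).
    across_rows = [reduce(xor, column) for column in zip(*mixed)]
    across_cols = [reduce(xor, row) for row in mixed]
    # (iii) Concatenate the first `take` values of each array.
    return across_rows[:take] + across_cols[:take]


tiny = [
    [(1, 2, 4), (0, 0, 0)],
    [(8, 0, 0), (1, 1, 1)],
]
print(verification_code(tiny, take=2))
```

Because every pixel contributes to one row value and one column value, altering a single color value of the stegoimage changes the code whenever the affected row or column falls within the retained values.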
2.7 Information recovery process
Due to the symmetric nature of the steganographic cryptosystem, the process of recovering the original message is the reverse of the sender's operation, using the same key at both ends and ensuring the integrity of the information through the verification code. If these last two conditions are not complied with, it will be impossible to recover the original message. Figure 9 illustrates the processes of encryption, steganography, and recovery of the original information, proposed in this document.
The algorithm presented in this work is summarized in Figure 10, emphasizing that each of the stages carried out in the proposed crypto-steganographic process is schematized there. Security and performance tests were applied on texts of different lengths and different carrier images, to validate this proposal, finding in all cases indicators consistent with the standards found in current scientific literature on security.
As a case study, we present the results using the key 5J*`;`ltsd;hwRf%e%.mqQ, to encrypt a text message of 2979 characters and hide it in the carrier image Baboon (Figure 11(a)). Figure 11(b) shows the stegoimage with the encrypted text, highlighting that changes are not evident to the human eye.
Nevertheless, the grayscale frequency histograms of the two images show slight differences, whereas the cumulative frequency histograms appear identical, which reduces the risk of arousing suspicion that hidden information is stored in the image. Figures 12(a) and 12(b) show the histograms corresponding to the original image and the stegoimage, respectively, while the cumulative frequency histograms of these images are presented in Figures 12(c) and 12(d).
Additionally, histograms of the original Baboon vs Baboon stegoimage (Figure 11) were obtained for each RGB layer, which are presented in Figure 13, showing high similarity between these histograms, a favorable situation in the context of steganography.
Another indicator considered to evaluate the differences between the carrier image and the stegoimage, in grayscale, was the calculation of the local entropy, which measures precisely the level of randomness of the pixels of an image [
The Shannon entropy values for the carrier image and the stegoimage were 7.380353762612659 and 7.380578390214877, respectively. The difference of about 0.00022 indicates that the level of disorder in the pixels of both images is almost identical, which is the outcome pursued in any steganographic process.
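For reference, the Shannon entropy of a grayscale image can be computed from the relative frequency of each gray level. The following is a minimal sketch that assumes the image is given as a flat sequence of integer gray levels; the function name `shannon_entropy` is hypothetical, and the sketch computes the global entropy, not the local entropy map.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Entropy, in bits per pixel, of a flat sequence of gray levels."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


print(shannon_entropy([0, 0, 255, 255]))  # two equally likely levels
```

For an 8-bit image the value ranges from 0, when all pixels share one gray level, to 8, when all 256 levels are equally frequent.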
The peak signal-to-noise ratio (PSNR), mean square error (MSE), normalized root mean square error (NRMSE), and structural similarity index (SSI) were also calculated and are presented in Table 1. The values obtained indicate that the differences between the original image and the stegoimage are minimal, which supports the security level of the proposed algorithm.
| Indicator | Value obtained |
| PSNR | 51.147060 |
| MSE | 7.678 × 10^-6 |
| NRMSE | 0.0051064 |
| SSI | 0.9986450 |
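The first three indicators of Table 1 can be computed as in the following sketch. It assumes that pixel values are scaled to the range [0, 1] and that the NRMSE is normalized by the root mean square of the reference image; other normalizations exist, and the one used by the authors is not stated here. The SSI is not included, and the function names are hypothetical.

```python
import math


def mse(ref, test):
    """Mean square error between two flat sequences of equal length."""
    return sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)


def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB; undefined for identical inputs."""
    return 10.0 * math.log10(data_range ** 2 / mse(ref, test))


def nrmse(ref, test):
    """RMSE divided by the root mean square of the reference (assumption)."""
    denom = math.sqrt(sum(a * a for a in ref) / len(ref))
    return math.sqrt(mse(ref, test)) / denom


original = [0.0, 0.5, 1.0, 0.5]
modified = [0.0, 0.5, 1.0, 0.6]
print(mse(original, modified), psnr(original, modified), nrmse(original, modified))
```

With a data range of 1, an MSE of 7.678 × 10^-6 corresponds to a PSNR of about 51.15 dB, which agrees with the values in Table 1.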
A reference for the development of this proposal was the work of Setiadi [
| Image | Original Entropy | Stegoimage Entropy |
| Lena | 7.49899 | 7.49957 |
| Baboon | 7.38035 | 7.38057 |
| Airplane | 6.73475 | 6.73082 |
However, [
| Algorithm used | Message length | MSE | PSNR | SSI |
| Proposed in [ | 16384 bits | 0.0554 | 60.697 | 0.9997 |
| Proposed in this work | 23832 bits | 7.678 × 10^-6 | 51.147 | 0.9986 |
Another fact to highlight in the proposed model, which results in good indicators, is related to the use of the Tinkerbell chaotic attractor in the pseudorandom reordering of the list of pixel coordinates that can be used to embed information, so that the values variations are distributed throughout the image, minimizing the possibility of finding a relationship between these values.
Regarding key sensitivity, Figure 15 shows the result of encrypting a text of 488 characters with two keys that differ in only one digit. The resulting cryptograms are completely different, which indicates that small changes in the key produce completely different results, a desirable property in cryptography.
Since the SHA3-256 hash function is used for key generation, the key space is larger than that obtained in [
| Algorithm | Key space | Correlation coefficient |
| Proposed in [ | 2^183 | -0.0014 – 0.00043 |
| Proposed in this work | 2^256 | 0.0039789 – 0.001112719 |
These results show that the proposal presented here is resistant to brute force attacks.
However, the steganographic process carried out with the two cryptograms of Figure 15 yields stegoimages that are indistinguishable to the human eye, as shown in Figure 16 and as corroborated by the calculation of Pearson’s correlation coefficients.
In turn, to evaluate the randomness of the pseudorandom cipher sequences generated by the Tinkerbell chaotic attractor, the statistical tests proposed by the National Institute of Standards and Technology (NIST) were used. Almost all of the 15 tests were passed, a favorable outcome for the validation performed. The results for a string of 60,000 bits obtained from the Tinkerbell attractor are shown in Figure 17.
The main contribution of this work is the combination of chaotic attractors and cellular automata in both encryption and text hiding within an image. By means of the Tinkerbell chaotic attractor and the use of cellular automata, a scheme for encrypting and hiding text on the edges of an image was designed, which, according to the performance tests carried out, can be classified as highly secure and reliable for use in real environments.
The work presented here highlights the strengths gained by combining multiple mathematical foundations in the encryption and hiding of information. This requirement has become more relevant with the massification of new forms of communication accessible to the vast majority of the population, the increase in devices connected to the Internet, advances in telecommunications, and worldwide population growth, as well as the COVID-19 pandemic, all of which increased the need for the secure exchange of large volumes of information through public channels.
It is important to note that the proposed scheme requires images with diverse color contrasts, and that the algorithm works independently of the size of the text to be hidden. If the text exceeds the capacity of the edge pixels of the carrier image, the steganography process can be carried out using more than one image.
The authors express their gratitude for the support received from the Center for Research and Scientific Development (CIDC) of the Universidad Distrital Francisco José de Caldas for the execution of the research project that led to this article.
The authors declare that there is no conflict of interest.