In This Article
In this article, we’ll explore how to work with bitarray in Python using the bitarray
library.
Computers store data in the form of bits.
But what exactly are bits?
Bits hold binary values: 0
or 1
.
We can easily make use of a specialized data structure called a bit array to make it easy to manipulate collections of bits. Harnessing the power of individual bits often requires specialized data structures, like bit arrays.
First understand what are bits?
The term “bit” is a portmanteau of “binary digit“.
In the binary system, digits are used in base 2
, rather than the base 10
system we use in everyday life.
This means that a bit, which is the smallest unit of data in a computer, can have only one of two values: 0
or 1
.
These two values can also represent other binary opposites such as true/false, on/off, yes/no, and so on. The choice of what each value represents is arbitrary and can be chosen to suit the situation.
Computer information is stored in the form of bits.
For example, a text character is typically stored as a sequence of 8
bits (known as a byte), with each bit representing a binary value.
In terms of data storage and transmission, bits are often grouped into larger units such as bytes (8
bits), kilobytes (1024
bytes), megabytes (1024
kilobytes), and so on.
Thus, a single bit represents the most basic unit of data a computer can process. However, when we need to manage numerous bits efficiently, bit arrays come into play.
What are BitArrays in Python?
Bit arrays (also known as bitmaps, bit sets, bit strings, or bit vectors), provide an efficient way to store sequences of bits.
Rather than using traditional data types like integers or strings, bit arrays allow for compact representation, especially when dealing with large datasets or sets of boolean values.
The Python library, `bitarray`, allows easy manipulation of bit arrays in Python.
In the following sections, we will see how to install, and make use of bitarray library.
How to use Python bitarray
library?
While Python offers us built-in collection data structures like lists, the `bitarray` library provides the specific functionalities for working with bit arrays.
It makes it easy to create, manipulate, and perform other operations on bit arrays.
To make use of bit arrays in Python, simply install the library using pip
:
pip install bitarray
In addition, conda
packages are available (both the default Anaconda repository as well as conda-forge support bitarray
):
conda install bitarray
Now that we have the bitarray
library installed, we can now start exploring how to work with bit arrays in Python.
Getting started with BitArrays in Python
The bitarray
library allows us to create and manipulate bit arrays.
In the sections that follow, we will explore-
- How to create bit arrays
- Perform logical operations on bitarrays
- Slice bitarrays
- Concatenate bit arrays
- count the vales in bitarrays
- Perform search within a bit array
So, let’s get started!
Create BitArrays in Python
To make use of a bit array, we have to create it first.
The code below shows how we can create bit arrays in code:
from bitarray import bitarray
empty_array = bitarray() # Creates an empty bitarray
print("empty array:",empty_array)
bit_array = bitarray(10) # Creates a bitarray of size 10
print("bit array:", bit_array)
Our code example creates two bit array instances.
The first one, empty_array
, is an empty bit array.
We used the bitarray()
constructor without any arguments to create an empty bit array.
We might need such bit arrays when we do not know in advance how many values we will have.
If we know in advance how many bits we need to store, we can specify it while constructing the bitarray
instance.
To do this, we pass the desired size as an argument to the bitarray()
constructor.
This is what our code did when creating the second bitarray
instance. In this example, bit_array
is an instance of the bitarray
class with a size of 10
.
When a bit array is created with a fixed size, all bits are initially set to 0
(as seen in the output).
As we will discuss later, we can modify the individual bits in the array using indexing.
Initializing bit arrays using setall
method
bitarray
has a method called setall
.
This method allows you to set all bits in a bit array to either 1
or 0
.
This can be particularly useful when you want to initialize a bit array with a specific value.
Here’s how you can use the setall
method:
from bitarray import bitarray
bit_array1 = bitarray(10)
bit_array2 = bitarray(10)
bit_array1.setall(0)
bit_array2.setall(1)
print("bit_array1:", bit_array1)
print("bit_array2:", bit_array2)
The code snippet defines two variables, bit_array1
and bit_array2
, both having the same size of 10 bits.
The setall
method takes a single argument, which is the value to be set for all bits. bit_array1
has all bits set to False
(or 0
), and bit_array2
is set to all True
(or 1
).
This method modifies the bit array in-place, meaning that it doesn’t return a new bit array, but instead changes the values of the bits in the existing bit array.
Initializing bit arrays from list
Another way to initialize a bit array is to use a list of bits, or booleans.
This can be particularly useful when you have a sequence of boolean values that you want to store efficiently.
See the example in the code snippet below:
from bitarray import bitarray
bit_array1 = bitarray([1, 0, 1])
bit_array2 = bitarray([True, False, True])
print("bit_array1:", bit_array1)
print("bit_array2:", bit_array2)
If we run the code, we see from the output that both bit_array1
and bit_array2
have the same contents:
In this example, bit_array
is a bitarray that is initialized with the boolean values True
, False
, and True
.
These boolean values are represented as bits in the bitarray, with True
represented as 1
and False
represented as 0
.
This method of initializing a bit array is very intuitive, as you simply provide a list of the bit values you need.
The `bitarray` constructor will then create a bitarray with the corresponding bit values.
Note: The order of the bit, or boolean, values in the list corresponds to the order of the bits in the bitarray. That is, the first value in the list corresponds to the first bit in the bitarray, the second boolean value corresponds to the second bit, and so on.
Initializing bit arrays from binary string
Another interesting way in which we can initialize a bitarray
is by supplying a binary string.
This can be particularly useful when we have a binary representation of data that we want to store.
Here’s how we can initialize a bit array from a binary string:
from bitarray import bitarray
str_array = bitarray('101010') # Initializes from binary string
print(str_array)
In this example, the bitarray
instance is initialized with a string of bits, or a binary string, 101010
.
When the code prints out str_array
, we get the following output:
The constructor interpretes the string as a sequence of bits., with 1
representing True
and 0
representing False
.
Like the case of using a list to initialize a bit array, the order of the bits in the string also sorresponds to the order of the bits in the bitarray.
Adding more bits to bit arrays
It is also possible to add more values to a bitarray.
We can do so using either the append
, extend
, or insert
methods.
See the code for how to add bits using these methods.
from bitarray import bitarray
bit_array = bitarray('1001')
print("Initial bit array:", bit_array)
# Append values True, True
bit_array.append(True)
bit_array.append(True)
print("bit array after appending 'True, True':", bit_array)
# Extend bitarray with string '1100'
bit_array.extend('1100')
print("bit array after extending with '1100':", bit_array)
# Extend bitarray with list [True, False, True]
bit_array.extend([True, False, True])
print("bit array after extending with the list '[True, False, True]':", bit_array)
# Insert Boolean value True
bit_array.insert(0, True)
print("bit array after inserting 'True' at the first position:", bit_array)
# Insert bit value 0
bit_array.insert(1, 0)
print("bit array after inserting '1' at the second position:", bit_array)
Our code has a series of calls to print
so show what happens when each method is called.
The printouts are shown below:
bit_array.append
expects a single parameter; the bit to append.
Then it appends the bit to the end of the bit_array.
With bit_array.extend
, we can append a binary string, or a list of bits to an existing bit array.
All we need to do is to specify the value we want to use in extending the existing bit array.
Finally, the method insert
requires an argument for the position to insert in the bit array, and another argument for the bit to insert.
Any method can be selected based on the needs of the code we are writing.
Concatenating bit arrays into one
Concatenation merges two or more bit arrays to get a single, larger array.
Concatenation in bit arrays is straightforward and follows the same principles as concatenating strings or lists in Python.
from bitarray import bitarray
# Initialize bit arrays
bit_array1 = bitarray('1010')
bit_array2 = bitarray('0101')
# Perform concatenation
combined_array = bit_array1 + bit_array2
print(combined_array)
The printout for the code execution is shown below:
From the output, we can see that the two bitarrays have been joined end-to-end to form a new bitarray.
Accessing values from a bit array
Accessing values from a bit array is similar to list indexing.
We just need to specify the index within square brackets as seen in the following code:
from bitarray import bitarray
# Create a bitarray
bit_array = bitarray('1100')
# Get the first and last values
first_value, last_value = bit_array[0], bit_array[-1]
print("The first value is:", first_value)
print("The last value is:", last_value)
From our code example, to access the first bit of bit_array
we indexed bit_array
at position 0
.
Accessing the last bit in bit_array
was done in a way similar to accessing the last item in a Python list; we indexed bit_array
at position -1
.
See the output below for the printout of our Python script.
With this indexing capability, we can also modify bits in specific positions, just like in the following code:
from bitarray import bitarray
# Create a bitarray
bit_array = bitarray('11101100')
# Set the value at index 3 and 6 to True
bit_array[3] = 1
bit_array[6] = 1
print("The new bit array is:", bit_array)
The printout shows that we successfully modified the bits at indexes `3` and `6`.
To access multiple bits at once, we can slice the bit array just as we would with a list.
Slicing allows us to access a range of bits in the bitarray.
Here’s an example:
from bitarray import bitarray
# Initialize bit array
bit_array = bitarray('1010111000')
# Perform slicing
subset = bit_array[2:8]
print(subset)
In this example, we used slicing (`[start:stop]`), which returns all elements from the `start`-th element up to but not including the `stop`-th element. In our case, the sart index was `2`, while the stop index was at position `8`.
The output below shows the result of our slicing.
Logical operations on bitarrays
The bitarray
supports a number of bitwise logical operations.
Here’s a brief overview of these operations:
- Bitwise AND (
&
): Returns1
if both the bits are1
, else it returns0
. - Bitwise OR (
|
): Returns1
if either of the bit is1
, else it returns0
. - Bitwise NOT (
~
): Returns0
if the bit is1
, or1
if the bit is0
. - Bitwise XOR (
^
): Returns1
if one of the bits is1
and the other is0
, else it returns0
. - Bitwise right shift (
>>
): Returns the bit array (on the left of the operator) shifted to the right by the quantity specified (on the right of the operator), and fills0
in the voids left as a result. - Bitwise left shift (
<<
): Returns the bit array (on the left of the operator) shifted to the left by the quantity specified (on the right of the operator), and fills0
in the voids left as a result.
Here’s a code example of how you can use these operations:
from bitarray import bitarray
# Creating bitarrays
bit_array1 = bitarray([False, False, True, True])
bit_array2 = bitarray([False, True, False, True])
# Logical operators
bit_and = bit_array1 & bit_array2 # Bitwise AND
bit_or = bit_array1 | bit_array2 # Bitwise OR
bit_not = ~bit_array1 # Bitwise NOT
bit_xor = bit_array1 ^ bit_array2 # Bitwise XOR
# Shift operators
bit_rshift = bit_array1 >> 1 # Bitwise right shift
bit_lshift = bit_array1 << 2 # Bitwise left shift
# Print the results
print("AND: ", bit_and)
print("OR: ", bit_or)
print("NOT: ", bit_not)
print("XOR: ", bit_xor)
print(f"Right Shift of {bit_array1}: ", bit_rshift)
print(f"Left Shift of {bit_array1}: ", bit_lshift)
In this code snippet, bit_array1
and bit_array2
are initialized with bit arrays '0011'
and '0101'
, respectively.
Subsequently, logical operations such as AND, OR, XOR, and NOT are performed between these bit arrays, showcasing their utility in bitwise manipulation.
The results of these operations are then printed out as seen below:
Counting set and unset bits
Counting the number of set bits, or ones, and unset bits, or zeros, within a bit array is a common operation.
The bitarray
library simplifies this task by providing the count
method to count bits that are 1
or 0
.
See the example below for how this can be done:
from bitarray import bitarray
# Initialize bit array
bit_array = bitarray('100011110000')
# Count set and unset bits
set_bits_count = bit_array.count(True)
unset_bits_count = bit_array.count(0)
print("Number of set bits: ", set_bits_count)
print("Number of unset bits: ", unset_bits_count)
The first call to the count
method specifies that we want to count the bits that are set to 1
or True
.
From the output we see that we have 5
set bits. The second call to the count
method counts the number of unset bits (which is 7
in this case).
See the output for the printouts:
Its time to put what we now know about Python bit arrays, it is time to put it to use by writing a simple prime sieve algorithm.
Example: Sieve of Eratosthenes
The Sieve of Eratosthenes is an algorithm used to find all prime numbers from the smallest, 2
, to a specified integer.
In this section, we will explore how to make use of bit arrays to make the algorithm more efficient.
The sieve algorithm iteratively marks the multiples of each prime number as composite, gradually sieving out non-prime numbers until only primes remain.
This method efficiently identifies primes by eliminating multiples of each prime starting from the smallest prime, 2
.
In the context of this algorithm, each index in the bit array represents a number, and its corresponding value indicates whether the number is prime or composite.
Let’s delve into a Python implementation of the Sieve of Eratosthenes algorithm using bit arrays:
from bitarray import bitarray
def sieve_of_eratosthenes(n):
primes = bitarray(n + 1)
primes.setall(True)
primes[:2] = False
for i in range(2, int(n ** 0.5) + 1):
if primes[i]:
primes[i * i::i] = False
return [i for i in range(n + 1) if primes[i]]
# Example usage
print(sieve_of_eratosthenes(100))
From the script we wrote, we defined a function sieve_of_eratosthenes
that accepts a single parameter, n
, which represents the upper limit, and returns a list of primes up to that limit.
The algorithm initializes a bit array primes
with a size of n + 1
, setting all values initially to True
.
Then, it marks index 0
and 1
as non-prime by setting their corresponding indices in the bit array to False
.
Then, it iterates over each number from 2
to the square root of n
, marking multiples of each prime as non-prime (by setting those bit positions to False
).
Finally, our function returns a list of prime numbers up to n
based on the values in the bit array. The list of primes is seen in the output below:
In conclusion, understanding bit arrays can significantly enhance the performance and efficiency of various algorithms and data manipulation tasks in Python.
With the `bitarray` library, handling boolean data becomes intuitive and streamlined, opening doors to a wide range of applications in computer science and programming.