In this tutorial, we’ll learn how to use readlines in Python without keeping the trailing newlines. We’ll look at two ways to do it, each with example code.
Working with raw data files is a common task in data analysis using Python. Raw data files are usually organized into lines of data, where each line is a single record or observation. However, each line is separated from the next by a newline character, which can interfere with our data analysis if we do not remove it.
Why readlines without newline in Python?
When we read data from a file using the file.readlines method, it returns a list of strings, each ending with a newline character (the last string ends with one only if the file itself ends with a newline).
The code example below shows this happening:
# Open the text file people.txt for reading
with open('people.txt', 'r') as fr:
    # Read the lines of data into a list
    lines = fr.readlines()
# Display all the lines of data
print(lines)
The output we get is:
The printed list of lines shows that each line has a newline character, \n, appended to it. So the main question is: how do we remove the newlines when using the readlines function?
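To see the trailing newlines without depending on a particular file, here is a minimal, self-contained sketch. It writes a small temporary file first; the names in it are made-up sample data, not the actual contents of people.txt. Note that the last line of this sample deliberately has no newline at the end.

```python
import os
import tempfile

# Hypothetical sample data; the real people.txt may look different.
# The final line has no trailing newline on purpose.
sample = "Alice\nBob\nCarol"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")
    # newline="\n" writes the text exactly as given on every platform
    with open(path, "w", newline="\n") as fw:
        fw.write(sample)

    with open(path, "r") as fr:
        lines = fr.readlines()

# Every element keeps its line ending, except a last line
# that had none in the file.
print(lines)  # ['Alice\n', 'Bob\n', 'Carol']
```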
Note
If you are not familiar with file object methods, you can read this article on file input output.
Two ways in Python to read lines without new lines
There are several ways to make sure that no newlines remain at the end of each line returned by the Python readlines function. This tutorial covers two of them.
In the examples that follow, we will make use of the file people.txt. The text file contains the following information about people:
Since people.txt has several lines of data, we can use it to demonstrate how to remove the newline from each line.
Python Readlines Without Newline Using Python rstrip()
rstrip is a string method that, when called without arguments, removes all whitespace at the end of a string instance. Whitespace includes tabs (\t), newlines (\n), and spaces. When we pass it a string of characters, as in rstrip('\n'), it removes only those characters from the end. For example, the code below displays the string Hello without the newline character at the end:
print("Hello\n".rstrip('\n'))  # Outputs: Hello
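The difference between calling rstrip with and without an argument matters when trailing spaces or tabs are part of the data. The short sketch below uses made-up strings to compare the two calls.

```python
raw = "Hello \t\n"

# Strip only the newline; the trailing space and tab are kept.
only_newline = raw.rstrip("\n")
print(repr(only_newline))  # 'Hello \t'

# Strip all trailing whitespace, including the space and the tab.
all_trailing = raw.rstrip()
print(repr(all_trailing))  # 'Hello'

# The argument is a set of characters, so "\r\n" removes any
# trailing mix of carriage returns and newlines.
windows_line = "Hello\r\n".rstrip("\r\n")
print(repr(windows_line))  # 'Hello'
```

Passing '\n' explicitly is the safer choice when trailing spaces may be meaningful.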
To get a list of strings without newlines, we have to call rstrip on each element of the list returned by readlines. Study the following code to see how it is done:
# Open the text file people.txt for reading
with open('people.txt', 'r') as fr:
    # Read the lines of data into a list
    lines = fr.readlines()
# Use method rstrip in a list comprehension to
# clean up the lines of data
lines = [line.rstrip('\n') for line in lines]
# Display all the lines of data
print(lines)
The code segment produces a list of strings that no longer contain the newline characters.
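As a variation on the same idea, a file object can be iterated directly, which yields the same lines that readlines returns. The sketch below is only an illustration: io.StringIO stands in for an open text file, and the names are made-up sample data.

```python
import io

# io.StringIO behaves like an open text file; the content is hypothetical.
fake_file = io.StringIO("Alice\nBob\nCarol\n")

# Iterating over a file object yields one line at a time,
# each still carrying its trailing newline.
lines = [line.rstrip("\n") for line in fake_file]

print(lines)  # ['Alice', 'Bob', 'Carol']
```

This avoids building the intermediate list with newlines, which can help with large files.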
Python Readlines Without Newline Using Python splitlines()
A file object also allows us to use the read method to read the data in a file.
Unlike readlines, the read method reads the whole file as a single string. Each newline in the returned string acts as the line delimiter within the file.
To separate the lines, we make use of the splitlines string method:
# Open the text file people.txt for reading
with open('people.txt', 'r') as fr:
    # Read the lines of data as a single string
    lines = fr.read()
# Use method splitlines to split the lines
# of data at the newline delimiters
lines = lines.splitlines()
# Display all the lines of data
print(lines)
Our code snippet produces similar output to the previous example:
The splitlines method “splits” the string of data (returned from our call to fr.read) into a list of strings.
It looks for line boundaries such as \n (and also \r\n and \r), makes the split at those locations, and leaves the line-ending characters out of the resulting strings.
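It may be tempting to use split('\n') instead of splitlines. The sketch below, using a made-up in-memory string, shows how the two differ. An in-memory string is used because reading a file in text mode normally translates line endings to \n already.

```python
# Hypothetical text with mixed line endings and a final newline
text = "Alice\nBob\r\nCarol\n"

# splitlines handles \n and \r\n alike and adds no empty string at the end
with_splitlines = text.splitlines()
print(with_splitlines)  # ['Alice', 'Bob', 'Carol']

# split('\n') leaves the \r behind and produces a trailing empty string
with_split = text.split("\n")
print(with_split)  # ['Alice', 'Bob\r', 'Carol', '']
```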
We now want to put our newfound skill to practical use by applying it to a data analysis task.
A practical example of Python Readlines Without Newline
Analyzing Temperature log file in Python
Let’s consider a real data analysis scenario where you’re reading and processing data from a daily temperature log file (temperature.txt).
In this scenario, our log file contains daily temperature records for the month of January for a particular location.
All temperature readings are in °C.
Our goal is to perform data analysis on this temperature data to gain insights or make weather-related decisions.
Our log file, temperature.txt, contains entries in the following format: YYYY-MM-DD: TEMPERATURE.
Each line represents the temperature recorded for a specific date.
Part of the file looks like the following:
Here’s the complete Python code:
from collections import namedtuple


def wrangle(temperature_file):
    with open(temperature_file, 'r') as fr:
        # Read the lines of data into a list
        lines = fr.readlines()
    # Clean up the newlines in the lines of data
    lines = [line.rstrip('\n') for line in lines]
    # Split each line at the ":" character
    lines = [line.split(':') for line in lines]
    # lines is now a list of [date, temperature] lists. Create a
    # named tuple template/class to represent each item
    LogInfo = namedtuple('LogInfo', ['date', 'temperature'])
    # Create instances of the LogInfo tuple
    data = [
        LogInfo(datestr, float(temperature_str))
        for datestr, temperature_str in lines
    ]
    return data


def temperature_stats(data):
    temp_values = [info.temperature for info in data]
    # Get the number of days in the log file
    num_days = len(data)
    # Calculate the average temperature
    sum_temp = sum(temp_values)
    temp_avg = sum_temp / num_days
    # Calculate the temperature range
    min_temp = min(temp_values)
    max_temp = max(temp_values)
    temp_range = max_temp - min_temp
    # Construct the string to return
    header = "Temperature log stats"
    data_stats_str = f"{header}\n"
    data_stats_str += f"{'-' * len(header)}\n"  # Underline the header
    data_stats_str += f"Total Days: {num_days}\n"
    data_stats_str += f"Average Temperature: {round(temp_avg, 1)}\n"
    data_stats_str += f"Temperature Range: {round(temp_range, 1)}\n"
    return data_stats_str


def main():
    data = wrangle('temperature.txt')  # Parse the log file
    print(temperature_stats(data))  # Display the stats


# Run the main function
main()
The previous script defines a function that parses the data found in the temperature log file and another that computes some statistics about it.
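To make the statistics concrete, the following sketch repeats the same arithmetic on a handful of made-up readings. The values are hypothetical and are not taken from temperature.txt.

```python
# Hypothetical temperature readings in °C
temps = [2.5, -1.0, 4.0, 0.5]

num_days = len(temps)                # number of records
average = sum(temps) / num_days      # sum is 6.0, so the average is 1.5
temp_range = max(temps) - min(temps) # 4.0 - (-1.0) = 5.0

print(num_days, average, temp_range)  # 4 1.5 5.0
```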
Note
The article Context manager in Python goes into more details about using context managers.
Parsing temperature.txt
The wrangle function is the part of our script that removes the newlines when reading the data from temperature.txt.
Since a newline delimits each line in the file, our wrangle function implements one of the techniques we discussed in the previous sections to remove the newline characters.
wrangle parses the contents of the temperature.txt file into a list of LogInfo named tuples, which makes it easy to access the data we need later.
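The per-line work that wrangle does can be sketched on a single record. The helper parse_line and the sample line below are hypothetical and only illustrate the idea; they assume the YYYY-MM-DD: TEMPERATURE format described earlier.

```python
from collections import namedtuple

LogInfo = namedtuple("LogInfo", ["date", "temperature"])


def parse_line(line):
    """Hypothetical helper: turn one raw log line into a LogInfo record."""
    # Remove the trailing newline, then split at the first ":" only
    date_str, _, temp_str = line.rstrip("\n").partition(":")
    # float() tolerates the space that follows the colon
    return LogInfo(date_str.strip(), float(temp_str))


# Made-up sample line, including the newline that readlines would keep
record = parse_line("2024-01-05: -3.5\n")
print(record.date, record.temperature)  # 2024-01-05 -3.5
```

Using partition instead of split keeps the parsing at exactly two fields even if a later part of the line contained another colon.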
Note
We used a named tuple as a simplified way to create object instances. If named tuples are new to you, check out our tutorial on named tuples.
In conclusion, before we analyze our data, we can make use of the new skills learned in this Python tutorial to read lines without new lines. This will eliminate some bugs that would have resulted from leaving the newlines intact.