Python Pandas: Convert CSV to Excel for Phoible and Ruhlen Datasets

This snippet uses Pandas to read tab-separated files containing phonetics data from the Phoible and Ruhlen datasets, then saves two of them as Excel files for easier inspection. It involves the following steps:

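Importing the library: The code imports Pandas, which provides the read_csv() and to_excel() functions used here.
Reading the files: The code reads four files: 'phoible_Features_Fonetikode.csv', 'Ruhlen_Features_Fonetikode.csv', 'phoible-phonemes.tsv', and 'phoible-segments-features.tsv'. All four are read with the same separator, a regular expression that matches a tab only when it is followed by an even number of double quotes before the end of the line, so a tab inside a double-quoted field is not treated as a column break. Regular-expression separators are handled by the Python parsing engine, hence engine='python'. The dtype=str argument keeps every column as text.
Saving data as Excel files: The code saves two of the dataframes, phoible_phonemes and phoible_segments, as .xlsx files using the to_excel() method provided by Pandas; the phoible and Ruhlen dataframes are read but not written out. The index=False argument keeps the row index out of the Excel output. Writing .xlsx files requires an Excel writer package such as openpyxl to be installed.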

import pandas as pd

# reading in phoible_Features_Fonetikode.csv
phoible = pd.read_csv('D:/pub/工作相关/国社科项目/articulationFromPhoibleRuhlen/phoible_Features_Fonetikode.csv', sep=r'\t(?=(?:[^"]*"[^"]*")*[^"]*$)', dtype=str, engine='python')

# reading in Ruhlen_Features_Fonetikode.csv
Ruhlen = pd.read_csv('D:/pub/工作相关/国社科项目/articulationFromPhoibleRuhlen/Ruhlen_Features_Fonetikode.csv', sep=r'\t(?=(?:[^"]*"[^"]*")*[^"]*$)', dtype=str, engine='python')

# reading in phoible-phonemes.tsv
phoible_phonemes = pd.read_csv('D:/pub/工作相关/国社科项目/articulationFromPhoibleRuhlen/phoible-phonemes.tsv', sep=r'\t(?=(?:[^"]*"[^"]*")*[^"]*$)', dtype=str, engine='python')

# reading in phoible-segments-features.tsv
phoible_segments = pd.read_csv('D:/pub/工作相关/国社科项目/articulationFromPhoibleRuhlen/phoible-segments-features.tsv', sep=r'\t(?=(?:[^"]*"[^"]*")*[^"]*$)', dtype=str, engine='python')

# saving phoible-phonemes.tsv as Excel file
phoible_phonemes.to_excel('D:/pub/工作相关/国社科项目/articulationFromPhoibleRuhlen/phoible-phonemes.xlsx', index=False)

# saving phoible-segments-features.tsv as Excel file
phoible_segments.to_excel('D:/pub/工作相关/国社科项目/articulationFromPhoibleRuhlen/phoible-segments-features.xlsx', index=False)
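
The separator pattern in the read_csv() calls above can be tried on its own with Python's re module. The sketch below applies it to a made-up sample line (not taken from the datasets) and shows that a tab inside a double-quoted field is left alone. Note that splitting with a regular expression leaves the quote characters in the resulting values.

```python
import re

# Same pattern as the sep argument above: a tab followed by an even
# number of double quotes before the end of the line.
SEP = re.compile(r'\t(?=(?:[^"]*"[^"]*")*[^"]*$)')

# Hypothetical sample line; the second field contains a quoted tab.
sample = 'a\t"b\tc"\td'

fields = SEP.split(sample)
print(fields)  # ['a', '"b\tc"', 'd']
```

The line splits into three fields rather than four, because the tab between b and c is followed by an odd number of double quotes and so does not match the pattern.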

This snippet is a starting point for loading and converting phonetics data from the Phoible and Ruhlen datasets. The same read_csv() and to_excel() calls can be applied to the remaining dataframes or to other tab-separated files.
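
If more files need converting, the output paths could be derived from the input paths instead of being typed out for each file. The following is a minimal sketch using pathlib. The relative directory data_dir and the helper excel_path_for are hypothetical names chosen for illustration and are not part of the original snippet.

```python
from pathlib import Path

# Hypothetical base directory; replace with the folder holding the files.
data_dir = Path('articulationFromPhoibleRuhlen')

# File names taken from the snippet above.
file_names = [
    'phoible-phonemes.tsv',
    'phoible-segments-features.tsv',
]


def excel_path_for(source: Path) -> Path:
    """Return the source path with its extension replaced by .xlsx."""
    return source.with_suffix('.xlsx')


# Pair each source file with the Excel file it would be written to.
pairs = [(data_dir / name, excel_path_for(data_dir / name)) for name in file_names]

for source, target in pairs:
    print(source.name, '->', target.name)
```

Each pair could then be passed to pd.read_csv() and to_excel() with the same arguments as in the snippet.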
