An XML Dataset for Hindustani Classical Music

cmisra/Sangeet

Sangeet

An XML Dataset for Hindustani Classical Music

Organization of the Dataset

The dataset is a collection of XML files, one per composition. Each file is named [raag-name]-[index of the composition within the raag]-[page number in the book Kramik Pustak Malika].xml.
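The naming scheme above can be unpacked programmatically. The following is a minimal Python sketch, assuming the three parts are separated by hyphens and that the index and page number are plain integers. The names `CompositionFile`, `parse_name`, and `count_by_raag` are hypothetical helpers, not part of the repository, and the file names in the usage note are made up for illustration.

```python
from collections import Counter
from pathlib import Path
from typing import NamedTuple


class CompositionFile(NamedTuple):
    """Parts of a dataset file name (hypothetical helper type)."""
    raag: str
    index: int
    page: int


def parse_name(filename: str) -> CompositionFile:
    """Split '[raag]-[index]-[page].xml' into its three parts.

    Splitting from the right keeps a raag name intact even if it
    contains a hyphen itself (an assumption, not confirmed by the source).
    """
    stem = Path(filename).stem  # drop the '.xml' extension
    raag, index, page = stem.rsplit("-", 2)
    return CompositionFile(raag, int(index), int(page))


def count_by_raag(filenames) -> Counter:
    """Count how many compositions each raag has."""
    return Counter(parse_name(name).raag for name in filenames)
```

For example, `parse_name("Bhairav-3-120.xml")` would yield the raag `Bhairav`, index `3`, and page `120`. Passing the names of all XML files to `count_by_raag` gives a per-raag tally that can be compared against the counts reported for the dataset.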

Currently the dataset consists of 116 XML files belonging to the raags Bhairav, Todi, and Poorvi, with 42, 39, and 35 compositions respectively.

Music-sheet Visualization

The music-sheets of the compositions are rendered using the Ome Swarlipi fonts and style engine. A sample HTML music-sheet is given in the Visualization directory, which also contains a converter that transforms an XML file into a music-sheet HTML file. To view a music-sheet in Devanagari script, you will need the notation.css file in the same directory as the HTML music-sheet file. This file is also included in the Visualization directory.

Query the Dataset using XQuery

There are four queries written in XQuery inside XQuery Files. To run them, an XML database first needs to be created from the XML files. We have used BaseX for that. The queries can be written and run in the BaseX editor itself.
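As an alternative to the editor, the same two steps (create a database, then run a query file) might be scripted. This is only a sketch, assuming the `basex` standalone launcher is installed and on the PATH. It relies on the launcher's `-c` option (execute a command string) and `-i` option (open a database as query context). The function names, the database name, and the paths are placeholders, and the repository's queries may expect a different database name or context.

```python
import subprocess


def basex_commands(db_name: str, xml_dir: str, query_file: str) -> list:
    """Build the two BaseX invocations without running them.

    Assumes paths contain no spaces, since the CREATE DB command
    is passed as a single string.
    """
    # Step 1: create a database from every XML file in xml_dir.
    create = ["basex", "-c", f"CREATE DB {db_name} {xml_dir}"]
    # Step 2: open that database and evaluate the XQuery file against it.
    run = ["basex", "-i", db_name, query_file]
    return [create, run]


def run_query(db_name: str, xml_dir: str, query_file: str) -> str:
    """Create the database, run the query, and return its text output."""
    create, run = basex_commands(db_name, xml_dir, query_file)
    subprocess.run(create, check=True)
    result = subprocess.run(run, check=True, capture_output=True, text=True)
    return result.stdout
```

Keeping command construction separate from execution makes it easy to inspect the exact invocations before anything touches the database.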

Machine Learning

To use the dataset to build ML classifiers for the raag classification problem, we have transformed the dataset into a CSV file that contains, for each composition, the frequency distribution of its notes and its raag. The CSV file is named Bhatkhande-Dataset.csv and can be found inside the Machine Learning directory. It can also be generated using the XQuery file freq-dist-notes-to-csv.xq in the XQuery Files directory. We have also included the Python code to load the dataset and run ML algorithms on it as an .ipynb file.
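To illustrate what a note frequency distribution per composition looks like, here is a small sketch in plain Python. It assumes each composition is already available as a list of note symbols and uses a hypothetical 12-symbol alphabet (`S r R g G m M P d D n N`). The column names and layout are illustrative only and are not claimed to match Bhatkhande-Dataset.csv or the output of freq-dist-notes-to-csv.xq.

```python
import csv
import io
from collections import Counter

# Hypothetical note alphabet: lowercase = komal, uppercase M = tivra Ma.
NOTES = ["S", "r", "R", "g", "G", "m", "M", "P", "d", "D", "n", "N"]


def note_frequencies(notes):
    """Return the count of each note in NOTES order.

    Symbols outside NOTES are silently ignored.
    """
    counts = Counter(notes)
    return [counts[n] for n in NOTES]


def to_csv(compositions):
    """Render (note list, raag) pairs as CSV text: one row per composition."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(NOTES + ["raag"])  # header: one column per note, then the label
    for notes, raag in compositions:
        writer.writerow(note_frequencies(notes) + [raag])
    return buffer.getvalue()
```

Each row then serves as one labeled feature vector for a classifier. Dividing the counts by the length of the composition would give relative frequencies instead, which may be preferable when compositions differ in length.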
