This repo contains a trained random forest model for predicting digital literacy.

nsgLUMS/predict_DigitalLiteracy

predict_DigitalLiteracy

This repository contains the trained Random Forest (RF) model for predicting the digital literacy of individuals using the best 7-item survey module (referred to as the platform-neutral module) described in the paper "Validated Digital Literacy Measures for Populations with Low Levels of Internet Experiences," The Journal of Engineering in Economic Development (Dev Eng), 2023.

  • Paper Authors: Dr. Ayesha Ali, Dr. Agha Ali Raza, and Dr. Ihsan Ayyub Qazi (LUMS)
  • R code: This code provides a trained Random Forest (RF) model in the R language
  • Note: For any questions or comments, please email ihsan.qazi@lums.edu.pk

The repository contains the following files:

  • "DL_model.rmd": R Markdown file for making predictions from the trained model. It contains an example.
  • "rf_model.rds": Trained RF model

The 7 items/questions in the survey module and their response options are as follows:

  1. Are you able to search/google things online? [Response Options: Yes(1); No(0)]

How familiar are you with the following computer and Internet-related items? Please choose a number between 1 and 5, where 1 represents no understanding and 5 represents full understanding of the item:

  2. Internet [Response Options: 1-5]
  3. Browser [Response Options: 1-5]
  4. PDF [Response Options: 1-5]
  5. Bookmark [Response Options: 1-5]
  6. URL [Response Options: 1-5]
  7. Torrent [Response Options: 1-5]
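
As a minimal sketch of how one respondent's answers could be checked against the response options above before they are passed to the model, the snippet below defines a helper in base R. The function name `validate_responses` and its argument names are hypothetical and are not part of this repository.

```r
# Hypothetical helper (not part of this repository): checks one respondent's
# answers against the response options listed above and returns them as a
# named numeric vector.
validate_responses <- function(search, internet, browser, pdf,
                               bookmark, url, torrent) {
  familiarity <- c(internet = internet, browser = browser, pdf = pdf,
                   bookmark = bookmark, url = url, torrent = torrent)

  # The search question is binary: Yes(1) or No(0).
  if (!(search %in% c(0, 1))) {
    stop("'search' must be 1 (Yes) or 0 (No)")
  }

  # Each familiarity item must be a whole number from 1 to 5.
  bad <- names(familiarity)[!(familiarity %in% 1:5)]
  if (length(bad) > 0) {
    stop("familiarity ratings must be integers from 1 to 5: ",
         paste(bad, collapse = ", "))
  }

  c(search = search, familiarity)
}

answers <- validate_responses(search = 1, internet = 5, browser = 4,
                              pdf = 3, bookmark = 2, url = 3, torrent = 2)
print(answers)
```

Rejecting out-of-range answers early avoids silently scoring a malformed survey record.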

Model Card

  • Input: one or more observations, where each observation corresponds to responses to the 7 questions above
  • Input Order: (term_pdf, term_internet, term_browser, term_bookmark, term_url, search, term_torrent)
  • Example input: (3, 5, 4, 2, 3, 1, 2)
  • Output: for each observation the model predicts a digital literacy score between 0 and 1
  • Model: A random forest model trained using 100,000 trees. We used the randomForest library in R and employed default values for other hyperparameters.
  • Model Performance: On out-of-bag (OOB) samples, R^2 was 0.8 and the mean squared error (MSE) was 0.019
  • Data: The model was trained on a sample of 143 individuals from Pakistan with different levels of digital literacy (please refer to the paper for a detailed description of the model)
  • Suitability: The model is best suited for populations comprising a high proportion of individuals with low levels of digital literacy.
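
The model card's example input can be turned into a prediction along the following lines. This is a sketch under stated assumptions, not the authors' script: it assumes the names in the Input Order bullet are also the predictor column names stored in the model, and that "rf_model.rds" sits in the working directory. "DL_model.rmd" remains the authoritative example.

```r
# Column names are taken from the "Input Order" bullet. Whether the saved
# model expects exactly these names is an assumption; check DL_model.rmd.
input_order <- c("term_pdf", "term_internet", "term_browser",
                 "term_bookmark", "term_url", "search", "term_torrent")

# Example input from the model card: (3, 5, 4, 2, 3, 1, 2)
new_obs <- as.data.frame(matrix(c(3, 5, 4, 2, 3, 1, 2), nrow = 1,
                                dimnames = list(NULL, input_order)))

model_path <- "rf_model.rds"  # assumed location: current working directory

if (file.exists(model_path) &&
    requireNamespace("randomForest", quietly = TRUE)) {
  library(randomForest)
  rf_model <- readRDS(model_path)                 # load the trained RF model
  score <- predict(rf_model, newdata = new_obs)   # digital literacy score
  print(score)
} else {
  message("rf_model.rds or the randomForest package was not found; ",
          "skipping prediction.")
}
```

To score several respondents at once, add one row per observation to `new_obs`, keeping the same column order.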
