This package implements the collaborative filtering algorithm of Breese et al. (1998) in the MATLAB programming language.
Based on the input data, I compute the unique vectors of users and movies. The movie-rating matrix (movie_rate_mat), whose rows represent users and whose columns represent movies, is populated from the input data. I define a logical matrix (rated_mat), with 1 indicating that a given user rated the movie and 0 otherwise.
The mean of each user's ratings is calculated. To compute the predicted vote (equation 1), I generate the user weight matrix (user_weight_mat) according to equation 2. This is the most time-consuming part of the code, so the program first looks for a saved data structure matching the input file; if it finds one in the same folder, it uses it, and otherwise it generates the necessary (and time-consuming) variables. The correlation between two users is defined over the movies for which both users have recorded votes. In practice, I compute the weights over all movies and then multiply by each user's rated vector, so that only commonly rated items remain. Upon running the program (allow some time for loading the data), the user can enter a user id and the number of recommended movies in the following format:
> userId, number of recommended movies
If the database contains fewer movies than requested, the program returns the available items. Enter 0,0 to terminate.
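As an illustration of the steps above, here is a minimal Python sketch of the memory-based prediction (the package itself is in MATLAB; the function name, variable names, and tiny example are mine, not the package's):

```python
import numpy as np

def predict_votes(ratings, rated, active):
    """Memory-based CF prediction in the style of Breese et al. (1998).

    ratings : (n_users, n_movies) array of votes (0 where unrated)
    rated   : boolean mask, True where a vote exists
    active  : index of the user to predict for
    """
    n_users, _ = ratings.shape
    # Mean vote of each user over the movies they actually rated
    means = ratings.sum(axis=1) / rated.sum(axis=1)

    # Pearson correlation weights over commonly rated items (eq. 2)
    weights = np.zeros(n_users)
    for i in range(n_users):
        common = rated[active] & rated[i]
        if common.sum() < 2:
            continue  # not enough overlap to correlate
        da = ratings[active, common] - means[active]
        di = ratings[i, common] - means[i]
        denom = np.sqrt((da ** 2).sum() * (di ** 2).sum())
        weights[i] = (da * di).sum() / denom if denom > 0 else 0.0

    # Predicted vote (eq. 1): mean vote plus normalized weighted deviations
    kappa = 1.0 / max(np.abs(weights).sum(), 1e-12)
    dev = (ratings - means[:, None]) * rated
    return means[active] + kappa * (weights[:, None] * dev).sum(axis=0)
```

Multiplying the deviations by the rated mask is what keeps only commonly rated items in the sums, mirroring the trick described above.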
Go to the GitHub page of the package.
Reference: Breese, John S., David Heckerman, and Carl Kadie. "Empirical analysis of predictive algorithms for collaborative filtering."
Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., 1998.
A 1D site response analysis code for conducting linear and equivalent linear site response analysis (written in MATLAB). I developed the code to provide an open-source platform for testing different damping models and for comparing a simplified 3D equivalent linear method, including the DRM element, with 1D solutions. The code solves the elastic wave equation by approximating the spatial variability of the displacements with finite elements (1D linear elements) and the time evolution with central differences. An implicit Newmark time-domain solution is also implemented. Here are some features of the program:
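To illustrate the explicit scheme, here is a short Python sketch of central-difference time stepping for a 1D soil column with linear finite elements and a lumped mass matrix (the package itself is MATLAB; the soil properties, grid, and initial condition below are assumed purely for illustration):

```python
import numpy as np

def central_difference_1d(n_el=50, h=1.0, G=8.0e7, rho=2000.0,
                          dt=1.0e-4, n_steps=200):
    """Explicit central-difference solution of rho*u_tt = (G*u_x)_x
    on a uniform column: free surface (node 0), fixed base (last node).
    All parameters are assumed example values, not the package's."""
    n = n_el + 1                       # number of nodes
    k = G / h                          # stiffness of one linear element
    # Assemble the global stiffness matrix from 2x2 element matrices
    K = np.zeros((n, n))
    for e in range(n_el):
        K[e:e + 2, e:e + 2] += k * np.array([[1.0, -1.0], [-1.0, 1.0]])
    # Lumped mass: half an element's mass to each of its two nodes
    M = np.full(n, rho * h)
    M[0] = M[-1] = rho * h / 2.0

    u = np.zeros(n)
    u[0] = 1e-3                        # small initial surface displacement
    u_prev = u.copy()                  # zero initial velocity
    for _ in range(n_steps):
        # Central differences: u_next = 2u - u_prev - dt^2 * M^-1 K u
        u_next = 2.0 * u - u_prev - dt ** 2 * (K @ u) / M
        u_next[-1] = 0.0               # enforce the fixed base
        u_prev, u = u, u_next
    return u
```

With these values the shear-wave speed is c = sqrt(G/rho) = 200 m/s, so dt = 1e-4 s is well below the explicit stability limit h/c = 5e-3 s.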
As part of a data science specialization capstone project (by Johns Hopkins University and Coursera), I developed a
word-prediction application for easier typing. The application receives a word or a sequence of words as input and predicts
the most probable upcoming word. To generate the n-grams, I used a corpus of formal and informal contemporary American
English (including news, blogs, and Twitter). The probability of unseen word sequences is assigned using the Katz back-off method.
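The back-off idea can be sketched as follows (in Python rather than the app's R code; this simplified version backs off on raw counts, whereas the deployed app uses Katz back-off with discounted probabilities):

```python
from collections import Counter

def build_ngrams(corpus, n_max=3):
    """Count all n-grams up to n_max from a list of tokenized sentences."""
    tables = {n: Counter() for n in range(1, n_max + 1)}
    for sent in corpus:
        for n in range(1, n_max + 1):
            for i in range(len(sent) - n + 1):
                tables[n][tuple(sent[i:i + n])] += 1
    return tables

def predict_next(tables, history, n_max=3):
    """Back off from the longest matching context to shorter ones,
    returning the most frequent continuation found."""
    for n in range(n_max, 0, -1):
        context = tuple(history[-(n - 1):]) if n > 1 else ()
        candidates = {g[-1]: c for g, c in tables[n].items()
                      if g[:-1] == context}
        if candidates:
            return max(candidates, key=candidates.get)
    return None
```

For example, after building the tables from a small corpus, `predict_next(tables, ["i", "drink"])` first tries the trigram table, then falls back to bigrams and finally to unigrams if the context was never seen.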
The application is uploaded on the Shiny server through the following link:
https://naeem.shinyapps.io/shinyapp-NLP/
Please take a look at the application and let me know your thoughts. If you are interested in the details, please refer to the application repository on my GitHub account.
https://github.com/Naeemkh/DataScienceCapstone
I used the quanteda package in R to process the corpus. If you are interested in R programming, natural language processing (NLP),
regular expressions, or developing a Shiny application, I encourage you to take a look at the source code. To run the
application on your computer, clone the repository and source the myapp.r file.
Data processing is an important step in many scientific studies. The type and size of the data, as well as the type of processing, determine the processing method. In seismological studies, we mostly deal with numerical data and a variety of customized processing. In many cases, users write a function in a programming language (e.g., MATLAB, Python, C, Fortran) and process the data with it. Depending on the data and the processing, a project may require several processing steps, and we may have to change a parameter and repeat the processing several times. Writing down the processing steps is good practice: it avoids confusion, keeps track of the processing, and makes it possible to go back and look for probable bugs in case of wrong results. However, since this is not automated, you may unintentionally skip a processing step or forget to document it properly. Even with complete documentation, if a problem is found in the processing steps, you need to repeat the whole process and go through all the steps again.
A MATLAB code for conducting Probabilistic Seismic Hazard Analysis (PSHA). The seismic source can be a point, line, or area source, or any number and combination of them. The Tavakoli and Pezeshk (2005) GMPE is implemented; however, one can add his/her own GMPE. The final results are a plot of the study region, the source-to-site distance distribution, and the hazard curve.
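The hazard-curve computation for a single source can be sketched in Python as follows. The GMPE inside is a hypothetical placeholder with assumed coefficients, not Tavakoli and Pezeshk (2005), and all input values in the example are made up for illustration:

```python
import numpy as np
from math import log
from statistics import NormalDist

def hazard_curve(x_values, mags, mag_probs, dists, dist_probs, rate):
    """Annual rate of exceedance for one source:
    lambda(PGA > x) = rate * sum_m sum_r P(m) P(r) P(PGA > x | m, r)."""
    def gmpe(m, r):
        # Toy attenuation relation with assumed coefficients (ln PGA in g);
        # a real study would substitute an actual GMPE here.
        mean = -1.0 + 0.8 * m - 1.2 * log(r + 10.0)
        sigma = 0.6
        return mean, sigma

    lam = np.zeros(len(x_values))
    for m, pm in zip(mags, mag_probs):
        for r, pr in zip(dists, dist_probs):
            mu, sigma = gmpe(m, r)
            for k, x in enumerate(x_values):
                # Exceedance probability from the lognormal GMPE
                p_exceed = 1.0 - NormalDist(mu, sigma).cdf(log(x))
                lam[k] += rate * pm * pr * p_exceed
    return lam
```

Summing the discretized magnitude and distance distributions against the GMPE's exceedance probabilities is the standard PSHA integral; plotting `lam` against `x_values` gives the hazard curve.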