Browsing by Author "Rajesh Piryani"

A linguistic rule-based approach for aspect-level sentiment analysis of movie reviews
(Springer Verlag, 2017) Rajesh Piryani; Vedika Gupta; Vivek Kumar Singh; Udayan Ghose
Aspect-level sentiment analysis refers to sentiment polarity detection from unstructured text at a fine-grained feature or aspect level. This paper presents our experimental work on aspect-level sentiment analysis of movie reviews. Movie reviews generally contain user opinion about different aspects such as acting, direction, choreography, cinematography, etc. We have devised a linguistic rule-based approach which identifies the aspects from movie reviews, locates opinion about each aspect and computes the sentiment polarity of that opinion using linguistic approaches. The system generates an aspect-level opinion summary. The experimental design is evaluated on datasets of two movies. The approach achieves good accuracy and shows promise for deployment in an integrated opinion profiling system. © Springer Nature Singapore Pte Ltd. 2017.
Analysing author name mentions in citation contexts of highly cited publications
(CEUR-WS, 2019) Rajesh Piryani; Wolfgang Otto; Philipp Mayr; Vivek Kumar Singh
In this paper, we are analysing author name mentions in citation contexts of highly cited articles in a PLOS ONE corpus. First, we have identified author mentions in our corpus of citation contexts. Then, we examined frequent nouns and verbs in the neighbourhood of the identified author mentions using n-grams and utilized these top nouns and verbs to identify the most frequent patterns. We observed that most frequent patterns are associated with the methods which are proposed in the corresponding highly cited references. © 2019 CEUR-WS. All rights reserved.
Book impact assessment: A quantitative and text-based exploratory analysis
(IOS Press, 2018) Rajesh Piryani; Vedika Gupta; Vivek Kumar Singh; David Pinto
Books are an important source of knowledge to disseminate information. Researchers and academicians write books to propagate their innovative research or teachings amongst academic as well as non-academic audiences. The number of books written every year is increasing rapidly. According to the International Publishers Association (IPA) annual report 2015-2016, around 150 million different books were published worldwide in 2014-2015. Many e-commerce websites are also involved in selling books. A recent addition to the book publishing world is e-books, which have made it very simple to publish. While the availability of a large number of books is good for readers, it is also challenging to find a good book, particularly in scholarly settings. Researchers in the area of Scientometrics have attempted to assess the goodness of a scholarly book by measuring the citations that the book receives. However, citations alone are not a true measure of a book's impact. Many times people use the knowledge in a book without actually citing it. Also, the use of books in classroom settings or for general reading is often not reflected in citations. Therefore, it is important to obtain users' opinion about a book from other forms of data. Fortunately, some data of this sort is now available in the form of reviews, downloads, social media mentions, etc. Amazon and Goodreads, both of which provide readers' views about a book, are two good examples. This paper presents exploratory research on using these non-traditional data about books to assess the impact of a book. A set of Scopus-indexed computer science books with good citations, as well as some other popular books in the computer science domain, are used for analysis. The reviews of the books have been crawled in an automated fashion from Amazon and Goodreads. Thereafter, sentiment analysis is carried out on the text of the reviews. Results of the sentiment analysis are compared and correlated with traditional impact assessment metrics. The experimental analysis does not show a coherent relationship between citations and online reviews. Also, the majority of the online reviews are found to be positive for a large number of books in the dataset. As a related exercise, the Scopus citation data and Google Scholar citation data for books are also compared. A high correlation is observed between these two. Overall, the exploratory analysis provides a useful insight into the problem of book impact assessment. © 2018-IOS Press and the authors. All rights reserved.
Generating aspect-based extractive opinion summary: Drawing inferences from social media texts
(Instituto Politecnico Nacional, 2018) Rajesh Piryani; Vedika Gupta; Vivek Kumar Singh
This paper presents an integrated framework to generate extractive aspect-based opinion summary from a large volume of free-form text reviews. The framework has three major components: (a) aspect identifier to determine the aspects in a given domain; (b) sentiment polarity detector for computing the sentiment polarity of opinion about an aspect; and (c) summary generator to generate opinion summary. The framework is evaluated on SemEval-2014 dataset and obtains better results than several other approaches. © 2018 Instituto Politecnico Nacional. All rights reserved.
Google Scholar as a pointer to open full-text sources of research articles: A useful tool for researchers in regions with poor access to scientific literature
(Routledge, 2023) Vivek Kumar Singh; Satya Swarup Srichandan; Rajesh Piryani; Anurag Kanaujia; Sujit Bhattacharya
The access to knowledge is an important requirement for advancement of scientific and technological research and development of a country. Availability of resources is a crucial bottleneck for universities and colleges in developing countries. This leads to frequent use of pirate access sites like Sci-Hub by researchers. For instance, India has more than 900 universities and 40,000 colleges, and Africa has more than 1200 universities. Only a few of these institutions would have access to most of the research journals their scholars require. It is in this context that we tried to find out if there exist some resources which can provide links to open and free to download versions of scientific papers. Google Scholar, a heavily used resource for research article searches, is explored to see how effective it is in providing links to open access freely downloadable copies of scientific articles. The complete set of global scientific publications for the year 2016 are computationally analyzed through a web-mining approach, as an example, to see if Google Scholar is able to point to freely downloadable open text versions of scientific articles. Results show that Google Scholar points to full-text sources for about 69% of the articles queried, with about 43% of the articles having openly accessible full-texts. The results, thus, indicate that Google Scholar can be a useful tool for locating open access full-text versions of close to about half of the scientific articles of the world, which has special significance for under-developed and developing countries. © 2022 African Journal of Science, Technology, Innovation and Development.
Highly cited references in PLOS ONE and their in-text usage over time
(International Society for Scientometrics and Informetrics, 2019) Wolfgang Otto; Behnam Ghavimi; Philipp Mayr; Rajesh Piryani; Vivek Kumar Singh
In this article, we describe highly cited publications in a PLOS ONE full-text corpus. For these publications, we analyse the citation contexts concerning their position in the text and their age at the time of citing. By selecting the perspective of highly cited papers, we can distinguish them based on the context during citation even if we do not have any other information source or metrics. We describe the top cited references based on how, when and in which context they are cited. The focus of this study is on a time perspective to explain the nature of the reception of highly cited papers. We have found that these references are distinguishable by the IMRaD sections of their citation. And further, we can show that the section usage of highly cited papers is time-dependent: the longer the citation interval, the higher the probability that a reference is cited in a method section. © 2019 17th International Conference on Scientometrics and Informetrics, ISSI 2019 - Proceedings. All rights reserved.
Movie Prism: A novel system for aspect level sentiment profiling of movies
(IOS Press, 2017) Rajesh Piryani; Vedika Gupta; Vivek Kumar Singh
This paper describes an integrated aspect level opinion summary generation system for movie reviews. The system, named as Movie Prism, analyses each movie review, locates aspect term in it, identifies opinion about those aspects and then generates a visual aspect based opinion summary of the movie in question. At present, the movie reviews and other related information is being automatically fetched from IMDb for all the movies released during the years 2010 to 2014. The system has an integrated crawler for this purpose. Further, ontology for the movie domain is created for better aspect identification. We have evaluated the system on three annotated movie review datasets. The system obtains good accuracy. Overall the designed system is capable of producing visual aspect level opinion summaries from unstructured textual reviews, without any need of training and results have a reasonable degree of accuracy. © 2017 - IOS Press and the authors. All rights reserved.
Open access levels and patterns in scholarly articles from India
(Indian Academy of Sciences, 2019) Rajesh Piryani; Jyoti Dua; Vivek Kumar Singh
Open access (OA) has emerged as an important movement worldwide during the last decade. There are several calls now that not only persuade researchers to publish in OA journals, to archive their pre- or post-print versions of papers in repositories, but also institutions and funding agencies to promote OA of research publications. This article examines OA levels and patterns in research output by computationally analysing research publication data obtained from the Web of Science for India during the last five years (2014-2018). Results obtained show that about 24% of research output from India, during the last five years, is available in OA compared to world average of about 30%. More articles are available in gold OA compared to green and bronze OA. Furthermore, OA levels vary in different disciplines, with medical science, physics and biology having higher percentage of their articles available as OA as compared to those like arts and humanities, social science and (surprisingly) information science. © 2019 Indian Academy of Sciences.
Preprint submissions by Indian scientists in arXiv
(Indian Academy of Sciences, 2020) Vivek Kumar Singh; Satya Swarup Srichandan; Rajesh Piryani
This note analyses the preprint submission patterns of Indian scientists to the popular preprint server arXiv. We analysed research papers published during a five-year period (2014-18) as indexed in the Web of Science and identified how many of these were deposited in arXiv. The discipline-wise distribution of research papers deposited in the repository was also analysed. Results show that overall, only about 3.5% of research papers were deposited in arXiv. The deposits, however, vary across disciplines, ranging from a high of about 23% for physics to a low of 0.4% for agricultural science and biology. We present the overall submission and download statistics for arXiv, and highlight the need for promoting the use of such repositories by the Indian scientific community. © Indian Academy of Sciences.
Revisiting subject classification in academic databases: A comparison of the classification accuracy of Web of Science, Scopus & Dimensions
(IOS Press BV, 2020) Prashasti Singh; Rajesh Piryani; Vivek Kumar Singh; David Pinto
Classification of research articles into different subject areas is an extremely important task in bibliometric analysis and information retrieval. There are primarily two kinds of subject classification approaches used in different academic databases: journal-based (aka source-level) and article-based (aka publication-level). The two popular academic databases- Web of Science and Scopus- use journal-based subject classification scheme for articles, which assigns articles into a subject based on the subject category assigned to the journal in which they are published. On the other hand, the recently introduced Dimensions database is the first large academic database that uses article-based subject classification scheme that assigns the article to a subject category based on its contents. Though the subject classification schemes of Web of Science have been compared in several studies, no research studies have been done on comparison of the article-based and journal-based subject classification systems in different academic databases. This paper aims to compare the accuracy of subject classification system of the three popular academic databases: Web of Science, Scopus and Dimensions through a large-scale user-based study. Results show that the commonly held belief of superiority of article-based subject classification over the journal-based subject classification scheme does not hold at least at the moment, as Web of Science appears to have the most accurate subject classification. © 2020 - IOS Press and the authors. All rights reserved.
Scholarly article retrieval from Web of Science, Scopus and Dimensions: A comparative analysis of retrieval quality
(SAGE Publications Ltd, 2023) Prashasti Singh; Vivek Kumar Singh; Rajesh Piryani
Scholarly databases are now being increasingly used for search and retrieval of research articles in different subject areas. Several previous studies have shown that different databases vary in their coverage of publication sources, and therefore, one may expect that for a given query, they may retrieve different results. However, how these databases compare in terms of relevance of the retrieved results is relatively unexplored. This study, therefore, attempts to bridge this research gap by carrying out a systematic study of retrieval relevance of the three scholarly databases – Web of Science, Scopus and Dimensions. Five selected queries are used for this purpose. The retrieved results from the three databases for the given queries are first analysed in terms of volume of retrieved records, language of retrieved records, etc. Thereafter, a user-based annotation scheme is used to assess and compare the relevance of retrieved results. The standard measures of normalised discounted cumulative gain (NDCG) and Spearman rank correlation coefficient (SRCC) are computed for the purpose. Results indicate that although the number of retrieved results for the same query differs significantly across the three databases, the databases differ only marginally in retrieval relevance, with Web of Science having a slight edge over the other two. © The Author(s) 2023.
Sentiment analysis in Nepali: Exploring machine learning and lexicon-based approaches
(IOS Press BV, 2020) Rajesh Piryani; Bhawna Piryani; Vivek Kumar Singh; David Pinto
In recent times, sentiment analysis research has gained tremendous momentum on English textual data; however, very little research has focused on Nepali textual data. This work focuses on Nepali textual data. We have explored machine learning approaches and proposed a lexicon-based approach using linguistic features and lexical resources to perform sentiment analysis for tweets written in the Nepali language. This lexicon-based approach first pre-processes the tweet, locates the opinion-oriented features and then computes the sentiment polarity of the tweet. We have investigated both conventional machine learning models (Multinomial Naïve Bayes (NB), Decision Tree, Support Vector Machine (SVM) and logistic regression) and deep learning models (Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) and CNN-LSTM) for sentiment analysis of Nepali text. These machine learning models and the lexicon-based approach have been evaluated on a tweet dataset related to the Nepal Earthquake 2015 and Nepal blockade 2015. The lexicon-based approach outperformed the conventional machine learning models. The deep learning models outperformed both the conventional machine learning models and the lexicon-based approach. We have also created Nepali SentiWordNet and Nepali SenticNet sentiment lexicons from existing English language resources as a by-product. © 2020 - IOS Press and the authors. All rights reserved.
The case of significant variations in gold–green and black open access: evidence from Indian research output
(Springer Netherlands, 2020) Vivek Kumar Singh; Rajesh Piryani; Satya Swarup Srichandan
Open Access has emerged as an important movement worldwide during the last decade. There are several initiatives now that persuade researchers to publish in open access journals and to archive their pre- or post-print versions of papers in repositories. Institutions and funding agencies are also promoting ways to make research outputs available as open access. This paper looks at open access levels and patterns in research output from India by computationally analyzing research publication data obtained from Web of Science for India for the last 5 years (2014–2018). The corresponding data from other connected platforms—Unpaywall and Sci-Hub—are also obtained and analyzed. The results obtained show that about 24% of research output from India, during last 5 years, is available in legal forms of open access as compared to world average of about 30%. More articles are available in gold open access as compared to green and bronze. On the contrary, more than 90% of the research output from India is available for free download in Sci-Hub. We also found disciplinary differentiation in open access, but surprisingly these patterns are different for gold–green and black open access forms. Sci-Hub appears to be complementing the legal gold–green open access for less covered disciplines in them. The central institutional repositories in India are found to have low volume of research papers deposited. © 2020, Akadémiai Kiadó, Budapest, Hungary.
The status and patterns of open access in research output of most productive Indian institutions
(Phcog.Net, 2020) Satya Swarup Srichandan; Rajesh Piryani; Vivek Kumar Singh; Sujit Bhattacharya
Open Access has been emerging as an important movement worldwide over the last few years, triggered mainly by the high subscription cost of paywalled journals, which creates barriers to the universal dissemination of the knowledge reported in those journals. The paywall barriers to access of knowledge have become so problematic that even institutions in developed countries are not only cancelling subscriptions but also mandating that their researchers either publish in open access journals or at least deposit their research papers in institutional repositories. The high subscription cost of journals is a more serious issue for developing countries, as it takes away institutional resources that could be used for other productive purposes. India has taken several steps to promote open access, including the release of an open access policy by the Ministry of Science and Technology; however, it is not very clear how effective these initiatives have been. This paper intends to address this issue. It examines published output, indexed in Web of Science, from the 100 most productive institutions in India and analyzes how much of their research output is available in Open Access (OA). The paper further analyzes the availability of research papers from these institutions on the popular pirate site Sci-Hub. It is interesting to observe that legal OA percentages are significantly lower than the Sci-Hub availability for all the institutions, an indication that the existing systems for promoting open access in India are not working efficiently. At the end, the paper also presents statistics about the number of papers deposited in three central institutional repositories in India. These statistics provide an indication of the extent to which these repositories have been able to promote open access in India. The paper concludes by pointing to some factors that impede Open Access in India. © 2020 Society for Industrial and Applied Mathematics.