ANALYSING LIBRARY USAGE PATTERNS: A VISUAL EXPLORATION OF BOOK LOAN AND ROOM RESERVATION TRENDS

Library usage refers to how users interact with library resources and spaces, including borrowing books, reserving study rooms, navigating library spaces, and using digital resources. Analysing library usage patterns helps libraries optimise resources, improve services, and better meet the needs of their users. This study aims to visually explore the patterns in book loan activities and room reservation behaviours using the R programming language to identify trends that provide preliminary data for decision-making when establishing library operational policies. Annual book loan and room reservation data from 2017 to 2023 from Perpustakaan Sultanah Bahiyah, the library of Universiti Utara Malaysia, were collected and converted into a format compatible with analysis in R. The analysis includes exploratory data analysis and visualisation using base R functions, base R graphics, and ggplot2 functionality. The findings, such as annual book loan rates, trends in book loan rates between undergraduate and postgraduate students, the frequency of discussion room reservations across years, and the relationship between book loans and room reservations over time, serve as foundational data to support decision-making for the efficient operation of libraries and to inform strategies for optimising library services and resources to better meet user needs.


INTRODUCTION
The ability to manage, comprehend, and derive insights from vast datasets is important in our rapidly evolving world. This skill is not exclusive to businesses; researchers and policymakers rely on it. As data

JOURNAL OF DIGITAL SYSTEM DEVELOPMENT e-journal.uum.edu.my/index.php/jdsd
The limited availability of literature on data visualization techniques specifically focused on analyzing and interpreting library usage and services suggests that this area of study is relatively underexplored or less documented compared to other topics within library science or data analytics. The next section of this review introduces R and highlights its implementation in other domains, providing valuable knowledge and inspiration that can be leveraged to enhance data analysis, visualization, and decision-making practices within library usage and service.

R was originally introduced in 1996 by Robert Gentleman and Ross Ihaka, members of the statistics department at the University of Auckland in New Zealand. From 1997 onwards, the "R Core Team" took charge of its ongoing development. Notably, R is designed to function across various operating systems, including Unix, Linux, Windows, and macOS (Mestiri, 2019). The R programming language is a powerful and versatile tool widely used for statistical computing and data analysis tasks in various fields, including healthcare (Islam et al., 2023; Konopka et al., 2018; Markazi-Moghaddam et al., 2020), amateur radio spot analysis (Chris, 2020), floral and geographical studies (Sichilongo et al., 2020), sentiment analysis of academic library tweets (Lund, 2020), and nutritional analysis (Vignesh & Nagaraj, 2022).

Islam et al. (2023) conducted exploratory data analysis using RStudio, focusing on a real-time dataset containing vital information regarding patients diagnosed with cardiovascular disease. The R packages used were ggplot2, tidyr, and dplyr. The data was efficiently processed and visually represented, aiding in drawing meaningful conclusions. Konopka et al. (2018) implemented a comprehensive toolkit tailored for thorough exploratory analysis of clinical datasets and conducted a case study analysis on a patient dataset. The aim was to explore the connections between health, genetics, and social status in the elderly population of Poland. The exploratory analysis primarily focused on numerical attributes and followed these steps: 1) normalization, 2) Principal Component Analysis (PCA), 3) detection and removal of outliers, and 4) clustering. The prcomp function from the stats package, which ships with base R, was employed for performing principal component analysis.

Markazi-Moghaddam et al. (2020) investigated the time intervals associated with the operating theater of a general hospital through a cross-sectional study utilizing data from patients undergoing elective surgery. The analysis commenced with data summarization and the identification of outliers and anomalies. The study examined whether significant differences existed between surgery categories concerning the duration of stay in the operating theater. In instances where the prerequisites for parametric tests were not met, the study employed a robust alternative to ANOVA based on 20% trimmed means (utilizing the R package WRS, Wilcox' Robust Statistics). Furthermore, robust post-hoc tests were conducted to ascertain differences between the means of all potential pairs of surgery categories. To address significant variations in group sizes among the surgery categories, a balanced dataset was created to assess potential improvements in results. The R package UBL was utilized for multi-class balancing.

Chris (2020) conducted a study on amateur radio spots utilizing records from the Reverse Beacon Network (RBN) to identify patterns in various variables. A "spot" is defined as a situation where there exists a propagation path between a transmitter and a receiver location at a specific time and frequency. The distance between the transmitter and receiver is computed using the distHaversine function in R, which calculates the shortest distance between two points on a sphere based on their latitude and longitude. A tool was developed to filter data for a particular unique transmitter city and visually represent activities at that specific location. This tool was constructed using functions and tools from libraries such as maps, ggplot2, tidyverse, and plyr.

Sichilongo et al. (2020) employed a combination of the Metab R package along with other tools such as the Automated Mass Spectral Deconvolution and Identification System (AMDIS), as well as MINITAB® statistical analysis software. Their objective was to classify the floral and geographical origins of three randomly selected commercially produced and three unprocessed natural organic honeys from Zambia and Botswana. They utilized gas chromatography mass spectrometry (GC-MS) untargeted metabolomics of volatile components for this purpose. The Metab R package was utilized to adjust peak intensities, present peak areas, eliminate false positives, perform normalization by internal standard and biomass, as well as conduct statistical tests such as analysis of variance (ANOVA) and t-tests on data generated by AMDIS for metabolomics, both targeted and non-targeted. Vignesh & Nagaraj (2022) applied Exploratory Data Analysis (EDA) to a nutritional dataset, employing visualizations produced in R such as bar charts, histograms, box plots, scatterplots, barcodes, and violin plots.
In summary, the aim of this study, which is to implement R programming for analysing and interpreting library usage in terms of book loans and discussion room bookings, is significant due to the existing gap in the literature and its practical relevance to library operations.

METHODOLOGY
The methodology section of this study outlines the procedures employed to conduct a visual analysis of the annual book loan and discussion room reservation data from Perpustakaan Sultanah Bahiyah, the library of Universiti Utara Malaysia. The procedures consist of data collection, preparation, exploratory data analysis (EDA), data visualisation, interpretation of findings, and discussion of implications for decision-making.
Data collection involved gathering annual book loan and discussion room reservation data from Perpustakaan Sultanah Bahiyah, spanning from 2017 to 2023. Subsequently, the collected data underwent preparation, in which the data was converted from Excel format to a format that was suitable for analysis in R.
After data preparation, exploratory data analysis (EDA) was conducted to understand the characteristics and patterns present in the data. This involved calculating summary statistics and exploring relationships between variables. Based on these relationships, the following research questions were formulated to investigate the observed patterns further:
1. How do annual book loan rates vary across different academic years within the population?
2. How do trends in annual book loan rates differ between undergraduate and postgraduate students within the library?
3. Are there noticeable differences in the frequency of room discussion bookings for various academic years?
4. Are there significant fluctuations in room types of reservation frequencies between different academic years?
5. How does the number of books borrowed from the library relate to the frequency of room discussion reservations over time?
The next step was to utilise the R programming language and its associated libraries, namely base R graphics and ggplot2, to create visualisations that address the research questions. ggplot2 is a powerful and widely used package in R for creating graphics and data visualisations (Wickham, 2016). It provides a flexible and intuitive syntax for building various plots, including scatter plots, bar plots, line plots, histograms, and more. The use of visual representations such as bar plots, line plots and scatter plots facilitates a clearer comprehension of each research question.
Once the visualisations had been generated, the findings were interpreted, focusing on key metrics such as annual book loan rates, trends across different borrower categories, the frequency of discussion room reservations over time, and any correlations between book loans and room reservations. These interpretations were contextualised within the framework of library operations and decision-making processes.

DATA PREPARATION
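The datasets that were collected consist of two Excel files (.xlsx) named Pinjaman Buku Mengikut Kategori Pengguna 2017 -2023 and Tempahan Bilik Karel dan Perbincangan. However, these files are not directly compatible with R. These files were therefore converted to a plain table format, namely comma-separated values (CSV). Next, both CSV files were imported into the R environment using the read.csv() function, and each imported file was assigned to a variable: loanbook for Pinjaman Buku Mengikut Kategori Pengguna 2017 -2023.csv and appointment for Tempahan Bilik Karel dan Perbincangan.csv.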
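As a minimal sketch of the import step in this study, the following R snippet writes a small CSV file and reads it back with read.csv(). The file contents, the month labels, and the temporary path are hypothetical stand-ins, since the library's actual files are not reproduced here.

```r
# Hypothetical stand-in for one of the converted CSV files; the values
# below are invented and only mimic the general layout of the loan data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Tahun,Bulan,Ijazah Pertama,Master Full Time",
  "2017,Jan,1200,300",
  "2017,Feb,1100,280"
), csv_path)

# read.csv() returns a data frame. With the default check.names = TRUE,
# spaces in column headers are replaced by dots (e.g. "Ijazah.Pertama").
loanbook <- read.csv(csv_path)
print(names(loanbook))
```

Because header names are adjusted on import, it may be worth checking names() before referring to columns by name in later steps.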

EXPLORATORY DATA ANALYSIS
The structure and class of each dataset were inspected with the str() function, while a glimpse of the data can be obtained with the head() and tail() functions, which by default show the first and last six rows of a dataset. Figure 1 shows the structure and several rows of the dataset. It shows that loanbook has 84 observations (rows) and 15 variables (columns), with the data types Tahun (integer), Bulan (factor), Ijazah Pertama (integer), Pelajar Jarak Jauh (integer), Master Full Time (integer), Master Part Time (integer), PhD Full Time (integer), PhD Part Time (integer), Ijazah Pertama UUMKL (integer), Master UUMKL (integer), PhD UUMKL (integer), Pinjaman Antara Perpustakaan (integer), Akademik (integer), Pentadbiran (integer) and Pesara (integer). Meanwhile, appointment has 84 observations (rows) and 4 variables (columns), with the data types Tahun (integer), Bulan (character), TEMPAHAN_BILIK_PERBINCANGAN (integer) and TEMPAHAN_KAREL_TERTUTUP (integer).
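The inspection step can be illustrated with a short sketch. The data frame below is synthetic: it only borrows the column names and the 84-row monthly shape reported for appointment, and the counts and month labels are invented.

```r
# Synthetic stand-in for `appointment`: 7 years x 12 months = 84 rows.
# The reservation counts are placeholders, not the library's data.
appointment <- data.frame(
  Tahun = rep(2017:2023, each = 12),
  Bulan = rep(month.abb, times = 7),
  TEMPAHAN_BILIK_PERBINCANGAN = seq_len(84),
  TEMPAHAN_KAREL_TERTUTUP = rev(seq_len(84))
)

str(appointment)          # number of rows, column names and column types
print(head(appointment))  # first six rows
print(tail(appointment))  # last six rows
```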

Figure 1
R Result on Viewing the Structure and Several Rows of the Dataset
Based on the research questions determined earlier, the next step was to examine various factors in the dataset and visualize them using both base R graphics and ggplot2 functionality to see how they relate to each other. The plots obtained in the study help to show the patterns and trends in the data more clearly. This process is described in detail in the next section.

DATA VISUALIZATION AND ANALYSIS
This section addresses the research questions stated in the methodology section using base R functions, base R graphics and ggplot2 functionality.

1. How do annual book loan rates vary across different academic years within the population?
To get the result, we first totalled the values of each column in the loanbook dataset by the Tahun (Year) variable, excluding the Bulan column, using the code shown in Figure 2. The aggregate() function is used to calculate the sum of the numeric variables in the loanbook dataset, grouped by the variable Tahun (Year). The dot (.) means that the aggregation function is applied to all remaining columns of the supplied data, while ~ Tahun specifies that the data are aggregated by the variable Tahun (Year). The code data = loanbook[, -2] specifies the dataset to be used for aggregation, which is loanbook with its second column excluded. The second column is Bulan, which holds non-numeric month labels that should not be summed. The code sum specifies the aggregation function to be used.
Together, this tells R to calculate the sum of each numeric variable for each group defined by the Tahun (Year) variable and to store the result in a new data frame called total_by_year. In the second line, the code total_by_year displays the content of the total_by_year variable, showing the total number of loans for each user type for each year from 2017 until 2023 (Figure 3).
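The aggregation described for total_by_year can be sketched on a small invented table. The call follows the form described in the text; the two-year data frame and its two user-type columns are illustrative assumptions, not the study's data.

```r
# Invented miniature of `loanbook`: column 1 is the year, column 2 the
# month label, and the remaining columns are loan counts per user type.
loanbook <- data.frame(
  Tahun = rep(c(2017, 2018), each = 3),
  Bulan = rep(c("Jan", "Feb", "Mar"), times = 2),
  Ijazah_Pertama = c(100, 120, 90, 80, 70, 95),
  Akademik = c(10, 12, 9, 8, 7, 11)
)

# Sum every remaining column by year; [, -2] drops the month column
# so that only numeric columns reach sum().
total_by_year <- aggregate(. ~ Tahun, data = loanbook[, -2], FUN = sum)
print(total_by_year)
```

One point to keep in mind is that the formula interface of aggregate() drops rows containing missing values by default (na.action = na.omit), so a dataset with gaps may need explicit handling before totals are compared.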

Bar Plot with Number of book loan based on Year and User Types
In Figure 5, the borrowing activity shows a consistent decrease from 2017 to 2021 for all types of users. This could be due to shifts in user behavior, including increased borrowing of e-books compared to physical books and a preference for online resources or remote services. In addition, the COVID-19 pandemic had a profound impact on how people engage with libraries. With restrictions on physical access to libraries and a heightened emphasis on social distancing, there was a significant increase in demand for digital resources. Users relied more on e-books, online databases, and digital collections for research, leisure reading, and academic purposes. However, there is a slightly increasing trend in borrowing activity from 2022 until 2023. As UUM reopened after the pandemic and public health restrictions eased, users may have sought out in-person activities, including visiting libraries and borrowing books. The reopening of the library provides users with access to physical collections, browsing opportunities, and in-person assistance from library staff, which may have led to an uptick in borrowing activity. While digital resources and online services played a significant role during the pandemic, there remains a distinct appeal to the physical experience of browsing shelves and discovering new books in libraries.
Based on the insights gleaned from the bar plot, UUM library administrators might prioritize acquiring more electronic resources or investing in technologies that support e-book lending, online databases, virtual reference services, or remote access to digital collections. In addition, UUM Library can offer various services beyond borrowing, such as research assistance, workshops, events, and maker spaces. Tracking the attendance or participation rates in these activities can help identify emerging interests or needs among library users. For instance, a surge in attendance at technology workshops may indicate growing interest in digital literacy skills, prompting the library to expand its technology-related programs. UUM Library can respond by enhancing its online presence, providing responsive online assistance, and fostering virtual communities. UUM Library can also invest in user-friendly self-service kiosks, mobile apps, or online platforms that empower users to independently manage their library accounts and access resources.
2. How do trends in annual book loan rates differ between undergraduate and postgraduate students within the library?
Undergraduate (UG) and postgraduate (PG) students often represent two major user segments in academic libraries. Understanding how book loan rates differ between these groups provides valuable insights into the borrowing behaviors and needs of the primary users of the library resources. To obtain the result for this research question, a line plot with months of the year on the x-axis and the number of book loans on the y-axis was created. Each line on the graph represents either the UG or the PG category, showing how their borrowing behaviors vary throughout the year.
To create the line plot, the ggplot2 library was first loaded. Next, the loanbook dataset was separated by year into data frames named loanbook2017, loanbook2018, loanbook2019, loanbook2020, loanbook2021, loanbook2022 and loanbook2023. Subsequently, a further data frame was derived from each of these, containing four variables, namely Tahun (Year), Bulan, UG and PG. UG contains the total number of borrowings for UG students, obtained by summing the values in columns 2, 3 and 8, which correspond to the UG variables Ijazah Pertama, Pelajar Jarak Jauh and Ijazah Pertama UUMKL respectively. PG contains the total number of borrowings for PG students, obtained by summing the values in columns 4, 5, 6, 7, 9 and 10, which correspond to the PG variables Master Full Time, Master Part Time, PhD Full Time, PhD Part Time, Master UUMKL and PhD UUMKL respectively. These data frames were named UGPG2017, UGPG2018, UGPG2019, UGPG2020, UGPG2021, UGPG2022 and UGPG2023.
Next, each of UGPG2017, UGPG2018, and so on was reshaped from wide to long format and assigned to a new variable. This allows for easier comparison and visualization of multiple variables over time. The Bulan variable was converted to a factor with the desired order of levels. This ensures that the months are plotted in the correct order on the x-axis of the plot. To create a line plot, the ggplot() function was used, where Bulan (Month) is plotted on the x-axis, Frequency_of_Borrowings is plotted on the y-axis, and different colors are used to distinguish between UG and PG student borrowings (color = Student_Types). The theme_minimal() function sets a minimal theme for the plot. Figure 6 shows sample code for the 2017 data covering the steps described in this and the previous paragraph. Figure 7 shows the line plots of the frequency of book loans for UG and PG students for the years 2017 until 2023.
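The reshaping and plotting steps for one year can be sketched as follows. The monthly figures are invented, the month labels are an assumption, and the wide-to-long step is written by hand in base R because the text does not state which reshaping function was used. The plot is drawn only if ggplot2 is installed.

```r
# Invented monthly totals for one year; UG and PG are assumed to have been
# summed already from the relevant user-category columns.
UGPG2017 <- data.frame(
  Bulan = month.abb,
  UG = c(900, 1100, 1500, 1400, 800, 400, 300, 600, 1300, 1600, 1200, 700),
  PG = c(300, 350, 420, 410, 380, 360, 340, 330, 400, 450, 390, 310)
)

# Wide to long: one row per month and student type.
UGPG2017_long <- data.frame(
  Bulan = rep(UGPG2017$Bulan, times = 2),
  Student_Types = rep(c("UG", "PG"), each = nrow(UGPG2017)),
  Frequency_of_Borrowings = c(UGPG2017$UG, UGPG2017$PG)
)

# Fix the month order; a plain character column would be sorted
# alphabetically on the x-axis.
UGPG2017_long$Bulan <- factor(UGPG2017_long$Bulan, levels = month.abb)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # `group` is needed so that geom_line() connects points across the
  # discrete month axis, one line per student type.
  p <- ggplot(UGPG2017_long,
              aes(x = Bulan, y = Frequency_of_Borrowings,
                  color = Student_Types, group = Student_Types)) +
    geom_line() +
    labs(x = "Bulan (Month)", y = "Frequency of borrowings") +
    theme_minimal()
  print(p)
}
```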

Figure 7
Line Plots for Frequency of Book Loans between UG and PG Students for the Years 2017 until 2023
UUM library administrators could consider reallocating resources based on the borrowing trends. Since UG students generally borrow more books, the library may need to ensure sufficient stock and availability of materials that cater to their needs. This might involve expanding collections in areas relevant to UG courses or popular topics. Recognizing the exceptions where PG borrowing peaks, the library administrators could implement targeted promotional campaigns to encourage more borrowing among PG students during those months. This could involve showcasing relevant resources, offering incentives, or organizing events tailored to PG students' interests or academic requirements. To accommodate fluctuations in borrowing activity, the library administrators may need to adjust service hours, staffing levels, or loan policies. For example, during peak borrowing periods, they could extend operating hours or provide additional assistance to manage increased demand. Since borrowing activity reflects academic schedules and study habits, the administrators could also review the allocation of study spaces within the library. They might consider adjusting seating arrangements or providing designated areas that cater to the specific needs of UG and PG students during peak usage times.
3. Are there noticeable differences in the frequency of room discussion bookings for various academic years?
To answer this research question, a bar plot like the one described for the first research question was created using the appointment dataset. Similar code was used to total the values of each column in the appointment dataset by the Tahun (Year) variable, excluding the Bulan column, and to store the result in a new data frame called totalApp_by_year, as shown in Figure 8. In the second line, the code totalApp_by_year displays the content of the totalApp_by_year variable, showing the total number of reservations for each room type (BILIK PERBINCANGAN and KAREL TERTUTUP) for each year from 2017 until 2023, as shown in Figure 9.
Examining the heights of the bars for each room type across the years in Figure 12 shows that users consistently preferred BILIK PERBINCANGAN from 2017 until 2019. This is perhaps due to the increased emphasis on collaborative learning approaches in educational institutions, which may drive higher demand for discussion rooms where students can engage in group study sessions, project work, or collaborative research. In addition, students may require dedicated spaces for team meetings, brainstorming sessions, or project discussions, leading to a preference for discussion rooms equipped with collaborative tools and technology. However, even though the number of reservations for both room types decreased in 2020 due to the influence of COVID-19, there was a steady increase in reservations for KAREL TERTUTUP. A possible reason is that some users prefer KAREL TERTUTUP for individual study, research, or focused work, as these spaces offer privacy and minimal distractions. During exam periods, students may seek out KAREL TERTUTUP as quiet study spaces conducive to concentration and exam preparation, leading to increased demand for these facilities. The movement control order imposed due to COVID-19 led to the closure of the university, which resulted in no reservations for either room type in 2021. The reopening of the UUM library restored access to physical services, which is reflected in the reservations recorded for both room types in 2022 and 2023. Even after the COVID-19 pandemic, users continued to prefer BILIK PERBINCANGAN over KAREL TERTUTUP. Based on these insights, the UUM library may consider expanding the number of available BILIK PERBINCANGAN or adjusting booking policies to better accommodate user demand.
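A grouped bar plot of yearly reservation totals, of the kind discussed for this research question, can be sketched with base R graphics. The totals below are invented, and the use of barplot() is an assumption, since the plotting code itself is not reproduced in the text.

```r
# Invented yearly totals in the shape described for `totalApp_by_year`.
totalApp_by_year <- data.frame(
  Tahun = 2017:2019,
  TEMPAHAN_BILIK_PERBINCANGAN = c(500, 520, 480),
  TEMPAHAN_KAREL_TERTUTUP = c(200, 210, 230)
)

# barplot() expects a matrix: each column is one group of bars (a year)
# and each row is one bar within the group (a room type).
counts <- t(as.matrix(totalApp_by_year[, -1]))
colnames(counts) <- totalApp_by_year$Tahun

barplot(counts, beside = TRUE, legend.text = rownames(counts),
        xlab = "Tahun (Year)", ylab = "Number of reservations")
```

Setting beside = TRUE places the two room types side by side for each year, which makes the comparison between room types within a year easier to read than stacked bars.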

4. Are there significant fluctuations in room types of reservation frequencies between different academic years?
To address this research question, a line plot with months of the year on the x-axis and the frequency of room reservations on the y-axis was created. Each line on the graph represents the reservation frequency of a different room type, showing how room reservation frequencies fluctuate over time.
To create the line plot, the steps were similar to those used for the second research question. First, the ggplot2 library was loaded. Next, the appointment dataset was separated by year into data frames named appointment2017, appointment2018, appointment2019, appointment2020, appointment2021, appointment2022 and appointment2023. Each of appointment2017, appointment2018, and so on was then reshaped from wide to long format and assigned to a new variable. The Bulan variable was converted to a factor with the desired order of levels. This ensures that the months are plotted in the correct order on the x-axis of the plot. To create a line plot, the ggplot() function was used, where Bulan (Month) is plotted on the x-axis, Frequency_of_Reservations is plotted on the y-axis, and different colors are used to distinguish between the room types (color = Room_Types). The theme_minimal() function sets a minimal theme for the plot. Figure 12 shows sample code for the 2017 data covering the steps for creating the line plot. Figure 13 shows the line plots of the frequency of room reservations by room type for the years 2017 until 2023.

Line Plots for Frequency of Room Reservation between BILIK PERBINCANGAN and KAREL TERTUTUP for year 2017 until 2023
In general, the patterns observed in reservations for both "BILIK PERBINCANGAN" and "KAREL TERTUTUP" exhibit fluctuations throughout the years 2017 to 2023. Both categories experience varying levels of activity, with some periods showing increases while others show decreases. The exception is 2021, when there were no reservations at all for either room type due to restrictions on physical access to libraries.
UUM library administrators could implement a more flexible reservation system that accommodates fluctuations in room bookings. This could involve allowing for shorter notice periods for reservations or implementing dynamic pricing strategies to incentivize bookings during periods of low demand. To mitigate the impact of fluctuations in room reservations, the library administrators could enhance communication with users to raise awareness of room availability and encourage bookings during quieter periods. This could include promoting the benefits of using library facilities for group discussions or private study sessions. During periods of low room reservations, the library administrators could explore alternative uses for these spaces. For example, they could repurpose the rooms for events, workshops, or collaborative activities that align with the library's objectives and user needs. Recognizing the impact of physical access restrictions on room reservations, the library administrators could invest in virtual collaboration tools and platforms. This would enable users to engage in group discussions or collaborative work remotely, providing an alternative solution during periods when physical access to library spaces is limited. The administrators may need to review existing reservation policies and procedures to ensure they remain responsive to changing user needs and circumstances. This could involve revising cancellation policies, extending reservation durations, or introducing tiered reservation systems to accommodate varying levels of demand. Learning from the experience of no reservations in 2021 due to physical access restrictions, the administrators could develop contingency plans and protocols to manage similar situations in the future. This could involve establishing guidelines for remote access to library resources and services, ensuring continuity of support for users during challenging circumstances.
5. How does the number of books borrowed from the library relate to the frequency of room discussion reservations over time?
To respond to this research question, a data frame named loanAppt was created by combining information from the loanbook and appointment datasets. The information from the loanbook dataset consists of the total number of book loans, obtained by summing columns 3 to 11, while the information from the appointment dataset consists of the total number of room reservations, obtained by summing columns 3 and 4. In addition, the Tahun and Bulan columns of the loanbook dataset were also included in the loanAppt data frame. Figure 14 shows the code for this step. In the second line, the code loanAppt displays the content of the loanAppt variable, as shown in Figure 15, which is a data frame with 4 columns and 84 rows. Column 3 contains the overall total of book loans for UG and PG students, while column 4 contains the total of both types of room reservation (BILIK PERBINCANGAN and KAREL TERTUTUP).
Next, we calculated the covariance and the correlation to assess the strength and direction of the relationship between the number of books borrowed and the frequency of room reservations, as shown in Figure 16. A positive covariance of 191636.7 suggests that as the number of books borrowed increases, the frequency of room reservations also tends to increase, while a correlation of 0.57 indicates a moderate positive linear relationship between these two variables. To address this research question, we also fitted a simple linear regression model to describe the effect that the total number of books borrowed might have on the frequency of room reservations. Figure 17 shows the code and results of fitting the simple linear regression using the lm() function and viewing the summary of the regression model using the summary() function. Overall, the result in Figure 17 provides information about the coefficients, the significance of the predictor, the goodness of fit, and the overall significance of the model. It suggests a statistically significant linear association between the frequency of room reservations and the total number of books borrowed, as indicated by the low p-value (1.519e-08). In addition, we plotted the simple linear regression model using the plot() function and the abline() function, which draws the regression line, as shown in Figure 18. The resulting scatter plot and regression line (Figure 19) show an upward slope, which suggests a positive correlation between the number of rooms reserved and the number of books borrowed.
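The covariance, correlation, and regression steps can be sketched on simulated data. The 84 monthly values below are randomly generated, the column names Total_Loans and Total_Reservations are hypothetical, and treating reservations as the response variable is an assumption made for illustration.

```r
# Simulated monthly totals (84 rows); these are not the library's figures.
set.seed(1)
n <- 84
Total_Loans <- round(runif(n, min = 500, max = 5000))
Total_Reservations <- round(200 + 0.05 * Total_Loans + rnorm(n, sd = 60))
loanAppt <- data.frame(Total_Loans, Total_Reservations)

# Direction (covariance) and strength (correlation) of the linear relationship.
cov_val <- cov(loanAppt$Total_Loans, loanAppt$Total_Reservations)
cor_val <- cor(loanAppt$Total_Loans, loanAppt$Total_Reservations)
print(c(covariance = cov_val, correlation = cor_val))

# Simple linear regression of reservations on loans.
fit <- lm(Total_Reservations ~ Total_Loans, data = loanAppt)
print(summary(fit))

# Scatter plot with the fitted regression line.
plot(loanAppt$Total_Loans, loanAppt$Total_Reservations,
     xlab = "Number of books borrowed",
     ylab = "Frequency of room reservations")
abline(fit)
```

In a simple linear regression with one predictor, the R-squared reported by summary() equals the squared correlation, so a correlation of 0.57 corresponds to roughly a third of the variance being explained by the fitted line.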

Scatterplot and Regression Line between Number of Books Borrowed with Frequency of Room Reservations
The library administrators could consider allocating resources based on the observed correlation. If there is a significant increase in the number of books borrowed corresponding to an increase in room reservations, the library may need to ensure a sufficient stock of materials to meet the anticipated demand. Recognizing the positive correlation, the administrators could optimize the utilization of library spaces to accommodate both study and collaborative activities. This could involve prioritizing room reservations for group discussions or study sessions during peak borrowing periods to enhance the overall user experience. The administrators could implement targeted promotional activities to encourage room reservations and borrowing activities concurrently. For example, they could offer incentives or discounts for users who reserve rooms for group study sessions or provide recommendations for relevant resources available for borrowing. Continuously monitoring and analyzing the correlation between room reservations and book borrowings allows the administrators to make informed decisions regarding resource allocation, service provision, and strategic planning. Regular assessment of these patterns enables the administrators to adapt and optimize library services to better meet the evolving needs of library users. Leveraging the positive correlation, the administrators could enhance the overall user experience by providing integrated services that facilitate both room reservations and access to relevant library resources. This could involve developing user-friendly platforms or mobile applications that streamline the process of booking rooms and borrowing books. Recognizing the potential for collaboration between room reservations and borrowing activities, the administrators could explore partnerships with academic departments or student organizations to promote joint initiatives that utilize library spaces and resources effectively.

CONCLUSION
In conclusion, through an examination of annual book loan rates across different academic years and between undergraduate and postgraduate students, as well as an analysis of discussion room booking frequencies over time, this study has provided valuable insights into the dynamics of library usage within academic settings. The findings reveal variations in book borrowing and room reservation patterns across different academic years, indicating potential shifts in student preferences, academic demands, and resource utilization trends. Additionally, the study highlights a correlation between the number of books borrowed and the frequency of discussion room reservations, suggesting interdependencies between these two aspects of library usage. Moving forward, further research offers opportunities to deepen our understanding of these dynamics and inform strategies for optimizing library services and facilities to better meet the evolving needs of users. Future directions could include developing predictive models to forecast fluctuations in reservation frequencies by room type between different academic years, conducting a behavioral analysis to understand how the number of books borrowed from the library relates to the frequency of discussion room reservations over time, and investigating the impact of technology, such as online reservation systems or digital resource access, on borrowing behavior and room reservation frequencies over time.

Figure 4

Figure 6

Figure 10

Figure 18

The datasets that were collected consist of two Excel files (.xlsx) named Pinjaman Buku Mengikut Kategori Pengguna 2017 -2023 and Tempahan Bilik Karel dan Perbincangan. However, these files are not directly compatible with R. These files were converted to a table format, such as comma-separated values (CSV). Next, both CSV files were imported using the read.csv() function in the R environment, then a variable was assigned to each CSV file, namely loanbook for Pinjaman Buku Mengikut Kategori Pengguna 2017 -2023.csv and appointment for Tempahan Bilik Karel dan Perbincangan.csv.