A slow version of the Rapid Automatic Keyword Extraction (RAKE) algorithm
You can get the stable version from CRAN:
install.packages("slowraker")
Or the development version from GitHub:
if (!require(devtools)) install.packages("devtools")
devtools::install_github("crew102/slowraker")
The slowraker package has one main function, slowrake(), which extracts
keywords from a vector of documents using the RAKE algorithm. RAKE doesn’t
require any training data, so it’s easy to use:
library(slowraker)
data("dog_pubs")
rakelist <- slowrake(txt = dog_pubs$abstract[1:5])
#> Warning: package 'slowraker' was built under R version 3.4.2
slowrake() outputs a list of data frames. Each data
frame contains the keywords that were extracted for an element of
txt:
rakelist
#> 
#> # A rakelist containing 5 data frames:
#>  $ :'data.frame':    61 obs. of  4 variables:
#>   ..$ keyword:"assistance dog identification tags" ...
#>   ..$ freq   :1 1 ...
#>   ..$ score  :11 ...
#>   ..$ stem   :"assist dog identif tag" ...
#>  $ :'data.frame':    90 obs. of  4 variables:
#>   ..$ keyword:"current dog suitability assessments focus" ...
#>   ..$ freq   :1 1 ...
#>   ..$ score  :21 ...
#>   ..$ stem   :"current dog suitabl assess focu" ...
#> #...With 3 more data frames.
You can bind these data frames together using
rbind_rakelist():
rakedf <- rbind_rakelist(rakelist = rakelist, doc_id = dog_pubs$doi[1:5])
head(rakedf, 5)
#>                         doc_id                            keyword freq score
#> 1 10.1371/journal.pone.0132820 assistance dog identification tags    1  10.8
#> 2 10.1371/journal.pone.0132820          animal control facilities    1   9.0
#> 3 10.1371/journal.pone.0132820          emotional support animals    1   9.0
#> 4 10.1371/journal.pone.0132820                   small body sizes    1   9.0
#> 5 10.1371/journal.pone.0132820       seemingly inappropriate dogs    1   7.9
#>                       stem
#> 1   assist dog identif tag
#> 2       anim control facil
#> 3        emot support anim
#> 4          small bodi size
#> 5 seemingli inappropri dog
To learn more about slowrake(), check out the “Getting started” vignette
(vignette("getting-started")). Frequently asked questions
are answered in the FAQs vignette (vignette("faqs")).
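For intuition about why RAKE needs no training data, here is a minimal base R sketch of the general RAKE scoring idea: split text into candidate phrases at stop words and punctuation, score each word as degree / frequency, and sum the word scores per phrase. This is an illustration only, not slowraker's implementation. The function name toy_rake() and its tiny stop word list are hypothetical, and the sketch skips the stemming shown in the stem column above.

```r
# Toy RAKE scorer in base R. NOT slowraker's implementation: no stemming,
# and the default stop word list is a small hypothetical one.
toy_rake <- function(txt,
                     stop_words = c("a", "an", "and", "are", "for",
                                    "in", "is", "of", "the", "to")) {
  # Lowercase, then break the text wherever something other than a-z or a
  # space appears (punctuation, digits), giving phrase fragments
  fragments <- unlist(strsplit(tolower(txt), "[^a-z ]+"))

  # Within each fragment, runs of consecutive non-stop words form a phrase
  phrases <- unlist(lapply(fragments, function(frag) {
    words <- strsplit(trimws(frag), " +")[[1]]
    words <- words[nzchar(words)]
    if (length(words) == 0) return(character(0))
    is_stop <- words %in% stop_words
    group <- cumsum(is_stop)[!is_stop]  # same id = same run of words
    vapply(split(words[!is_stop], group), paste, character(1), collapse = " ")
  }))
  phrases <- unname(phrases)

  phrase_words <- strsplit(phrases, " ")
  all_words <- unlist(phrase_words)
  n_words <- lengths(phrase_words)

  # Frequency: number of times a word occurs across all candidate phrases
  freq <- vapply(split(all_words, all_words), length, integer(1))
  # Degree: summed length of the phrases a word occurs in (counts itself)
  deg <- vapply(split(rep(n_words, n_words), all_words), sum, numeric(1))
  word_score <- deg / freq

  # A phrase's score is the sum of its word scores
  score <- vapply(phrase_words, function(w) sum(word_score[w]), numeric(1))
  out <- data.frame(keyword = phrases, score = score, stringsAsFactors = FALSE)
  out <- out[!duplicated(out$keyword), ]
  out[order(-out$score), ]
}

toy_rake("Rapid automatic keyword extraction is a keyword extraction algorithm.")
```

In this example "rapid automatic keyword extraction" scores 15 and "keyword extraction algorithm" scores 10. Words that appear in long phrases get a high degree, so longer phrases tend to rank first, which matches the pattern in the rakedf output above.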