The Internet is full of content that anyone can access, but just because you can see it doesn't mean you can download the content and turn it into a dataset. This kind of activity is considered "reuse." Always check the copyright status and license (when applicable) of any text you want to mine, and make sure you understand any restrictions on reuse. Librarians are a great resource to consult when you are looking for material to use for text mining.
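Why does this matter? Scraping or bulk-downloading from a library database without permission amounts to stealing protected information, and that violation of the vendor's Terms of Service can get access turned off for the entire campus.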
Always ask your library liaison if you don't know how to access the underlying data in a library database -- we can help you do it legally! Depending on the database vendor, we may be able to get hard drives of all the underlying data, or we can help to set up API access for you.
Questions? Contact email@example.com