These data are available in the openintro package as the data frame email.
These data represent incoming emails for the first three months of 2012 for David Diez's Gmail account. All personally identifiable information has been removed.
A data frame with 3921 observations on the following 21 variables.
Variable | Description |
---|---|
spam | Indicator for whether the email was spam. |
to_multiple | Indicator for whether the email was addressed to more than one recipient. |
from | Whether the message was listed as from anyone (this is usually set by default for regular outgoing email). |
cc | Indicator for whether anyone was CCed. |
sent_email | Indicator for whether the sender had been sent an email in the last 30 days. |
time | Time at which email was sent. |
image | The number of images attached. |
attach | The number of attached files. |
dollar | The number of times a dollar sign or the word "dollar" appeared in the email. |
winner | Indicates whether "winner" appeared in the email. |
inherit | The number of times "inherit" (or an extension, such as "inheritance") appeared in the email. |
viagra | The number of times "viagra" appeared in the email. |
password | The number of times "password" appeared in the email. |
num_char | The number of characters in the email, in thousands. |
line_breaks | The number of line breaks in the email (does not count text wrapping). |
format | Indicates whether the email was written using HTML (e.g. may have included bolding or active links). |
re_subj | Whether the subject started with "Re:", "RE:", "re:", or "rE:". |
exclaim_subj | Whether there was an exclamation point in the subject. |
urgent_subj | Whether the word "urgent" was in the email subject. |
exclaim_mess | The number of exclamation points in the email message. |
number | Factor variable saying whether there was no number, a small number (under 1 million), or a big number. |
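
The description above can be checked against an installed copy of the package. The following is a minimal sketch that assumes openintro is already installed (for example via install.packages("openintro")). The installed version may differ from the documentation as copied here, so the dimensions and variable types it reports are not guaranteed to match.

```r
# Assumption: the openintro package is installed on this machine.
library(openintro)

# Load the data frame described above into the workspace.
data(email)

# Compare with the documented size (3921 observations, 21 variables).
dim(email)

# List each variable with its type and first few values.
str(email)

# Tabulate the spam indicator to see how many emails were flagged.
table(email$spam)
```

Running str() first is a quick way to see which variables are stored as numeric indicators and which as factors before modelling.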
This information was copied from ?email on March 20, 2016.