Chatroom Spam Prediction

This method predicts whether the given input (preferably from a chatroom or message board) is spam by comparing it to an existing dataset. It works best with English, the default input language, but CoffeeHouse can also translate the given text into English if that produces better results.

The size of the input is limited by your subscription, and a larger input can take longer to process.

Parameter Name | Default Value | Required | Description
input | NULL | True | The input to process. The size of the input is limited by your subscription.
language | en | False | The language of the input. If the given language is not English, CoffeeHouse will attempt to translate the input to English before processing.
sentence_split | 0 | False | Splits the results into sentences
generalize | 0 | False | Generalizes the results using a generalization table
generalization_size | NULL | False | The size of the generalization table to create
generalization_id | NULL | False | The ID of the generalization table to use
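
As a minimal sketch of how the parameters above fit together, the following Python helper assembles a parameter dictionary using the documented names and defaults. The function name `build_spam_params` is hypothetical, and sending the two flags as 0/1 integers is an assumption based on the default value of 0 shown in the table. The endpoint URL and authentication are not covered on this page, so they are left out.

```python
from typing import Any, Dict, Optional


def build_spam_params(
    text: str,
    language: str = "en",
    sentence_split: bool = False,
    generalize: bool = False,
    generalization_size: Optional[int] = None,
    generalization_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble request parameters using the documented names and defaults."""
    if not text:
        # 'input' is the only required parameter in the table.
        raise ValueError("'input' is required and must not be empty")

    params: Dict[str, Any] = {
        "input": text,
        "language": language,
        # Assumption: flags are sent as 0/1, matching the documented default of 0.
        "sentence_split": int(sentence_split),
        "generalize": int(generalize),
    }
    # These default to NULL, so they are omitted unless explicitly set.
    if generalization_size is not None:
        params["generalization_size"] = generalization_size
    if generalization_id is not None:
        params["generalization_id"] = generalization_id
    return params
```

The resulting dictionary can be passed as form or query data to whichever HTTP client you use.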

Example Success Response (Without sentence_split)

{
  "success": true,
  "response_code": 200,
  "results": {
    "text": "But I must explain to you how all this mistaken idea of denouncing pleasure and praising pain was born and I will give you a complete account of the system. expound the actual teachings of the great explorer of the truth, the master-builder of human happiness.",
    "source_language": "en",
    "spam_prediction": {
      "is_spam": false,
      "prediction": 96.3785,
      "predictions": {
        "ham": 96.3785,
        "spam": 3.5341400000000003
      }
    },
    "generalization": null
  }
}

Example Success Response (With sentence_split)

{
  "success": true,
  "response_code": 200,
  "results": {
    "text": "But I must explain to you how all this mistaken idea of denouncing pleasure and praising pain was born and I will give you a complete account of the system. expound the actual teachings of the great explorer of the truth, the master-builder of human happiness.",
    "source_language": "en",
    "spam_prediction": {
      "is_spam": false,
      "prediction": 99.3571885,
      "predictions": {
        "ham": 99.3571885,
        "spam": 0.6332131785
      }
    },
    "sentences": [
      {
        "text": "But I must explain to you how all this mistaken idea of denouncing pleasure and praising pain was born and I will give you a complete account of the system.",
        "offset_begin": 0,
        "offset_end": 156,
        "spam_prediction": {
          "is_spam": false,
          "prediction": 98.73317,
          "predictions": {
            "ham": 98.73317,
            "spam": 1.2477244
          }
        }
      },
      {
        "text": "expound the actual teachings of the great explorer of the truth, the master-builder of human happiness.",
        "offset_begin": 157,
        "offset_end": 260,
        "spam_prediction": {
          "is_spam": false,
          "prediction": 99.98120700000001,
          "predictions": {
            "ham": 99.98120700000001,
            "spam": 0.018701956999999998
          }
        }
      }
    ],
    "generalization": null
  }
}

Example Success Response (With generalization)

{
  "success": true,
  "response_code": 200,
  "results": {
    "text": "But I must explain to you how all this mistaken idea of denouncing pleasure and praising pain was born and I will give you a complete account of the system. expound the actual teachings of the great explorer of the truth, the master-builder of human happiness.",
    "source_language": "en",
    "spam_prediction": {
      "is_spam": false,
      "prediction": 96.3785,
      "predictions": {
        "ham": 96.3785,
        "spam": 3.5341400000000003
      }
    },
    "generalization": {
      "id": "a8c1874cfa4e1d64867b9868450f18104085d13a89140a93ecb015df2bad0c8b",
      "size": 20,
      "top_label": "ham",
      "top_probability": 96.3785,
      "probabilities": [
        {
          "label": "ham",
          "calculated_probability": 96.3785,
          "current_pointer": 0,
          "probabilities": [
            96.3785
          ]
        },
        {
          "label": "spam",
          "calculated_probability": 3.5341400000000003,
          "current_pointer": 0,
          "probabilities": [
            3.5341400000000003
          ]
        }
      ]
    }
  }
}

Response Structure

Name | Type | Description
text | string | The text of the input, or the translation result if the input is in another language
source_language | string | The original source language
spam_prediction | SpamPrediction | SpamPrediction object that represents the spam prediction values
sentences | SpamPredictionSentence[] | Array of sentence splits containing SpamPrediction values. Returned only if you use sentence_split.
generalization | Generalization\|null | The generalization results if generalization is used

SpamPrediction Object Structure

Name | Type | Description
is_spam | boolean | Indicates whether the input is spam, determined by whether the spam value is greater than the ham value
prediction | float | The value of the winning label: the spam value if is_spam is true, the ham value if is_spam is false
predictions | array | String-to-float map of all the predictions, with ham and spam as keys
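
The relationship between the three fields can be sketched in Python as follows. The `SpamPrediction` dataclass mirrors the table above; the helper names `parse_spam_prediction` and `is_consistent` are hypothetical and only illustrate the documented rule that is_spam is true when the spam value exceeds the ham value.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class SpamPrediction:
    is_spam: bool
    prediction: float
    predictions: Dict[str, float]


def parse_spam_prediction(obj: Dict[str, Any]) -> SpamPrediction:
    """Convert a decoded spam_prediction JSON object into a dataclass."""
    scores = {label: float(value) for label, value in obj["predictions"].items()}
    return SpamPrediction(bool(obj["is_spam"]), float(obj["prediction"]), scores)


def is_consistent(p: SpamPrediction) -> bool:
    """Check the documented rule: is_spam means spam > ham, and
    prediction carries the value of the winning label."""
    expected_is_spam = p.predictions["spam"] > p.predictions["ham"]
    winning_label = "spam" if expected_is_spam else "ham"
    return p.is_spam == expected_is_spam and p.prediction == p.predictions[winning_label]
```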

SpamPredictionSentence Object Structure

Name | Type | Description
text | string | The text of the sentence
offset_begin | int | The character offset at which the sentence begins
offset_end | int | The character offset at which the sentence ends
spam_prediction | SpamPrediction | SpamPrediction object that represents the spam prediction values for this sentence
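
As an illustration of how the offsets might be used, the sketch below walks the sentences array and returns the sentences flagged as spam, slicing each one out of the full text. It assumes that offset_end is exclusive, which appears consistent with the sentence_split example response (the first sentence is 156 characters long and spans offsets 0 to 156). The function name `spam_sentences` is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def spam_sentences(results: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Return (sentence text, spam score) for each sentence flagged as spam."""
    text = results["text"]
    flagged: List[Tuple[str, float]] = []
    # 'sentences' is only present when sentence_split is used.
    for sentence in results.get("sentences", []):
        # Assumption: offset_end is exclusive, so a plain slice recovers the sentence.
        span = text[sentence["offset_begin"]:sentence["offset_end"]]
        prediction = sentence["spam_prediction"]
        if prediction["is_spam"]:
            flagged.append((span, prediction["predictions"]["spam"]))
    return flagged
```

If the results were produced without sentence_split, the function returns an empty list.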

Generalization Labels

This method supports generalization and uses the following labels. For more information on how generalization works, see Generalization - Introduction.
Label
ham
spam
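
A minimal sketch of reading these labels back out of a response: the helper below collects the calculated_probability of each label from the generalization object, using only the field names visible in the generalization example response. The function name `generalized_scores` is hypothetical, and how the table values are computed is not described on this page.

```python
from typing import Any, Dict, Optional


def generalized_scores(results: Dict[str, Any]) -> Optional[Dict[str, float]]:
    """Map each generalization label to its calculated_probability.

    Returns None when the response carries no generalization data.
    """
    generalization = results.get("generalization")
    if generalization is None:
        return None
    return {
        entry["label"]: entry["calculated_probability"]
        for entry in generalization["probabilities"]
    }
```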

Invalid Language Code Response

This response is given when the parameter language contains an invalid value.
{
  "success": false,
  "response_code": 400,
  "error": {
    "error_code": 7,
    "type": "CLIENT",
    "message": "The given language 'not a real language' cannot be identified"
  }
}

Unsupported Language Response

This response is given when the requested language is not supported by the method.
{
  "success": false,
  "response_code": 400,
  "error": {
    "error_code": 23,
    "type": "CLIENT",
    "message": "The given language 'nr' is not supported"
  }
}
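
The error responses above share one shape, so a client can handle them in a single place. The following sketch checks the success flag of a decoded response and raises an exception carrying the error fields. The class name `CoffeeHouseError` and the function name `unwrap_results` are hypothetical and not part of any official client library.

```python
from typing import Any, Dict


class CoffeeHouseError(Exception):
    """Hypothetical exception type carrying the documented error fields."""

    def __init__(self, response_code: int, error_code: int, error_type: str, message: str):
        super().__init__(f"[{error_code}] {message}")
        self.response_code = response_code
        self.error_code = error_code
        self.error_type = error_type
        self.message = message


def unwrap_results(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return the 'results' object on success, otherwise raise CoffeeHouseError."""
    if payload.get("success"):
        return payload["results"]
    error = payload.get("error", {})
    raise CoffeeHouseError(
        response_code=payload.get("response_code", 0),
        error_code=error.get("error_code", 0),
        error_type=error.get("type", "UNKNOWN"),
        message=error.get("message", "Unknown error"),
    )
```

A caller can then branch on `error_code`, for example 7 for an unidentified language and 23 for an unsupported one, as shown in the responses above.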