Text Preprocessing

  • Published 19 Dec 2024

COMMENTS • 12

  • @khairulikhwanazman6094 2 months ago +1

    I need to save the cleaned, preprocessed words from Orange to Excel, but whenever I save the data it just reverts to the original data.
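
A minimal sketch of one way to get preprocessed words into a file that Excel can open. It assumes the tokens are already available as one list of strings per document; how to export them from Orange is not covered here, and the function name `write_tokens_csv` and the sample data are hypothetical. It uses only Python's standard `csv` module.

```python
import csv
import io


def write_tokens_csv(texts, tokens, out):
    """Write one row per document: the original text and its space-joined tokens."""
    writer = csv.writer(out)
    writer.writerow(["text", "tokens"])
    for text, doc_tokens in zip(texts, tokens):
        writer.writerow([text, " ".join(doc_tokens)])


# Hypothetical example data; in practice these would come from your own corpus.
texts = ["The cats are sleeping.", "Dogs bark loudly!"]
tokens = [["cat", "sleep"], ["dog", "bark", "loudly"]]

# An in-memory buffer keeps the sketch self-contained; an open file works the same way.
buffer = io.StringIO()
write_tokens_csv(texts, tokens, buffer)
csv_text = buffer.getvalue()
```

To write a real file, pass `open("preprocessed.csv", "w", newline="", encoding="utf-8")` instead of the buffer; the file name is only an example.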

  • @eylmaz6696 7 months ago

    Does Orange have a cumulative distribution function and a probability distribution function for getting results?

    • @OrangeDataMining 7 months ago

      Not sure what you wish to achieve, but these options are available in the Distributions widget.
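
For readers unsure what a cumulative distribution function computes, here is a small sketch of an empirical CDF in plain Python. It is not how the Distributions widget is implemented; the function name `empirical_cdf` and the sample values are illustrative assumptions.

```python
from bisect import bisect_right


def empirical_cdf(values):
    """Return a function giving the fraction of observations that are <= x."""
    ordered = sorted(values)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts how many sorted observations are <= x.
        return bisect_right(ordered, x) / n

    return cdf


# Hypothetical sample of four observations.
cdf = empirical_cdf([2, 4, 4, 7])
share_up_to_four = cdf(4)  # three of the four observations are <= 4
```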

    • @eylmaz6696 7 months ago

      @OrangeDataMining For clustering with the k-means algorithm, which is more important: the silhouette score, or the centering on the intersection as checked in a scatter plot?

    • @eylmaz6696 7 months ago

      @OrangeDataMining For k-means clustering, how can I interpret the result? For instance, the relation between "I have anxiety" and "I don't have anxiety", or "I sleep much" and "I don't sleep much". When I cluster them, should I interpret the result using the maximum silhouette score?

    • @OrangeDataMining 7 months ago

      @eylmaz6696 Apologies, I don't quite understand the question.

    • @eylmaz6696 7 months ago

      @OrangeDataMining Do you have a support email or phone number? Can I ask one question?

  • @gabrielapinto5306 7 months ago

    I am finding it difficult to adapt all that to tweets written in Portuguese. Does Orange have a solution?

    • @OrangeDataMining 7 months ago

      Yes. The tokenizer remains the same. Stopwords are available for Portuguese, too. The same goes for lemmatization (UDPipe only). SBERT and FastText also support pt. In summary, most language-specific methods support Portuguese (some also pt-br); the others are language-independent.
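
As a rough illustration of the first two steps mentioned in this reply (tokenization and stopword removal) applied to Portuguese text, here is a minimal sketch in plain Python. It is not Orange's tokenizer or its stopword list: the regular expression, the tiny `STOPWORDS_PT` set, the function name `preprocess`, and the sample tweet are all assumptions made for the example.

```python
import re

# Tiny illustrative stopword set; a real list (for example, from NLTK) is much longer.
STOPWORDS_PT = {"a", "o", "e", "de", "que", "do", "da", "em", "um", "uma",
                "para", "com", "não", "no", "na", "se", "eu"}


def preprocess(text):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    # \w+ matches runs of letters and digits, including accented Portuguese letters.
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in STOPWORDS_PT]


# Hypothetical tweet used only for demonstration.
cleaned = preprocess("Eu não gosto de acordar cedo na segunda-feira!")
```

The tokenizer itself is language-independent here; only the stopword set is specific to Portuguese, which mirrors the point made in the reply.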

  • @neilirvine7129 8 months ago

    Love it!

  • @nadiamaelaniulfah1100 4 months ago

    Does Orange not support Arabic? Orange reported "no text found" when I uploaded my Arabic corpus. Is there any solution for this? 🥲

    • @OrangeDataMining 4 months ago

      Orange supports Arabic to some extent. There is an Arabic lemmatizer with UDPipe, stopwords from NLTK, and embedders in Document Embedding. The error suggests that you are likely missing a text variable in your data. Please head to our discussions board (github.com/biolab/orange3-text/discussions), where we can pinpoint your problem.