Triggering Tableau extract refreshes using the REST API

  • Published 7 Mar 2021
  • In this Tableau Server REST API tutorial we demonstrate how you can trigger extract refreshes for Tableau workbooks and datasources programmatically.
    This capability is especially valuable for teams who want extracts to be updated when the underlying data is ready to be refreshed, instead of setting extract refreshes on a timed schedule.
    Is the data ready to be updated at 6:51 AM? This shows you how to trigger that extract now instead of waiting until the schedule runs at 10 AM.
    Medium tutorial article on triggering extract refreshes:
    / triggering-extract-ref...
    For more written tutorials, check out the Medium blog posts!
    / elliottstam​
    Python version used: 3.8
    Coding environment: Jupyter Lab
    Tableau Server REST API endpoint focus:
    Update Workbook Now
    Update Data Source Now
    Query Job
    To make sure you have all the latest features, update tableau-api-lib:
    pip install -U tableau-api-lib
    Getting started with tableau-api-lib: • tableau-api-lib (blitz...
    Join the Tableau Developer Program to get involved. It comes with a free Tableau Online developer site!
    www.tableau.com/developer
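
For orientation, here is a rough sketch of the REST paths behind the three endpoints listed above. The helper names and the sample IDs are hypothetical, and the API version string depends on your Tableau Server release. The two refresh endpoints are called with POST and Query Job with GET.

```python
def refresh_workbook_path(api_version: str, site_id: str, workbook_id: str) -> str:
    """Path for Update Workbook Now (sent as a POST request)."""
    return f"/api/{api_version}/sites/{site_id}/workbooks/{workbook_id}/refresh"


def refresh_datasource_path(api_version: str, site_id: str, datasource_id: str) -> str:
    """Path for Update Data Source Now (sent as a POST request)."""
    return f"/api/{api_version}/sites/{site_id}/datasources/{datasource_id}/refresh"


def query_job_path(api_version: str, site_id: str, job_id: str) -> str:
    """Path for Query Job (sent as a GET request)."""
    return f"/api/{api_version}/sites/{site_id}/jobs/{job_id}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(refresh_workbook_path("3.10", "my-site-id", "my-workbook-id"))
```

With tableau-api-lib you normally do not build these paths yourself; the library wraps them in methods on the connection object.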

COMMENTS • 70

  • @Tycho1987 3 years ago +1

    Thanks for sharing, I like your style! Keep it up!

    • @devyx 3 years ago

      Much appreciated!

  • @pavelnacev 3 years ago +1

    Great video, thanks for sharing!

    • @devyx 3 years ago

      Appreciate the feedback, Pavel! Glad you liked it.

  • @yifanj4648 2 months ago

    Which part makes the Tableau dashboard automatically refresh as soon as data source refresh completes? Sorry I didn't catch that.

  • @pjosifovic 3 months ago

    Sorry for my lack of knowledge, but can you trigger an extract refresh based on a JavaScript event? I have a use case where a user would like to press a button on our web platform and run an extract refresh with a scheduled subscription.

  • @pawanjadhav9139 2 years ago

    This is really helpful, thanks for sharing! In addition to this, I have a scenario where the extract refresh should trigger automatically after the database load run completes. Can you please share any article or reading material on it?

    • @devyx 2 years ago

      The shortest point to make is that you can trigger an extract refresh at any time using Python code, and it's your decision how you make use of that fact in your workflows.
      Personally I like the idea of setting up something like an Airflow DAG where one task would be your database ETL job, and then downstream from that task would be the extract refresh task.
      If your ETL stage fails, its downstream task (extract refresh) does not run. You can also integrate smart notifications such as a Slack message or an email notification that the ETL failed.
      Airflow is a great orchestration tool for data teams who are good with Python.
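
The gating idea described above can be sketched without any orchestration tool. This is plain Python, not an Airflow DAG, and the three callables are hypothetical stand-ins: `run_etl` for your database load, `trigger_refresh` for the call that starts the extract refresh, and `notify` for a Slack or email alert.

```python
from typing import Callable, Optional


def run_etl_then_refresh(
    run_etl: Callable[[], None],
    trigger_refresh: Callable[[], str],
    notify: Callable[[str], None],
) -> Optional[str]:
    """Run the ETL step and trigger the extract refresh only if it succeeds."""
    try:
        run_etl()
    except Exception as exc:
        # The downstream step (extract refresh) is skipped when the ETL fails.
        notify(f"ETL failed, extract refresh skipped: {exc}")
        return None
    # The ETL finished, so the data is ready and the refresh can start.
    return trigger_refresh()
```

An orchestrator such as Airflow gives you the same dependency between tasks, plus retries, scheduling, and logging.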

  • @yuvakarthiking 2 years ago

    Hello, thanks for the video. I am trying to do the same, but I am getting an error like "name or service not known".

  • @alliejackson1033 2 years ago

    Thanks for the video! Is this relevant for Tableau Public as well? Or just Tableau Server? Thank you!

    • @devyx 2 years ago

      You can use the REST API with Tableau Server and Tableau Online. You can think of Tableau Server like owning a house, Tableau Online like renting an apartment, and Tableau Public like hanging out in a public park. You can use the REST API when it's your environment to do with as you wish, but when you're in the public park you don't have the rights to dig up that tree or move that fountain.
      Long way of saying the REST API doesn't work on Tableau Public, and hopefully that helps paint the picture of why. It's like one giant Tableau Online environment that's more limited than if you have your own workspace.

    • @alliejackson1033 2 years ago

      @devyx Haha - thanks for the analogy. And thanks for the reply! Very helpful.

  • @timothyakers4041 2 years ago

    I'm still relatively new to both Python and Tableau, and while I understand this completely and am able to replicate it, I had a question about exactly what steps I should take. When you are updating a workbook, are you simply updating the workbook with data that comes in from the data source? If, for example, your workbook was not updated with newer data, would that require you to run an update for both the data source and the workbook itself?

    • @devyx 2 years ago +1

      Hey Timothy, welcome to both Python and Tableau! Let me take a stab at answering your questions.
      TL;DR
      In general when people say they are "updating a workbook", what they really mean is that they are updating (refreshing) the data populating a workbook. You can of course literally update a workbook by editing aspects of the workbook such as the worksheets, dashboards, calculated fields, etc within the workbook and saving the changes, but I'd encourage you to separate that sort of change from the concept of updating/refreshing the data that feeds into the workbook.
      Want to know more? Read on!
      In the Tableau world, there are some file types that are good to know about:
      1) workbook files (.twb)
      2) packaged workbook files (.twbx)
      3) datasource files (.tds)
      4) packaged datasource files (.tdsx)
      The .twb and .tds files are really just XML filled with metadata about visuals and formulas (.twb files) or metadata about where Tableau can find data (.tds files). These files never hold any of the underlying data within them; they simply describe what to do with the data and how to connect to it.
      Because the .twb and .tds files don't hold any of the underlying data, they are said to have a "live" connection. With live datasources, there is no need to "update" a workbook because the data is always connected directly to the source (database, extract file, CSV file, or whatever).
      However, you can also take snapshots of data from the underlying source. For example, I have a dashboard that connects to Google BigQuery. It's my decision to maintain a live connection to that database, but I could create an extract of how that data looks right now and store that locally on my computer or publish it to Tableau Server.
      The .twb and .tds files don't contain extracts -- so if I begin using extracts, I can combine my XML files (.twb, .tds) with the extracts as a zip file (.twbx, .tdsx), and then the extract is packaged along with the metadata describing how to make use of the data in that extract.
      So to sum it all up, let's say:
      1) I have a .twb file that has a live connection to Google BigQuery.
      2) I have a .twbx file that has an extract whose data was sourced from my Google BigQuery table.
      In scenario (1) I don't "update" my workbook because the data is always live and fresh as can be.
      In scenario (2) I can "update" my workbook, which is synonymous with triggering a refresh of the extracts feeding the workbook.
      It's worth pointing out that refreshing extracts and updating a workbook are not technically the same thing. If you update a workbook, you are triggering all of the extracts that fuel the workbook. So updating a workbook can lead to refreshing multiple extracts, whereas refreshing an extract directly will only refresh that one extract being targeted.
      Hope that helps!
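
Since the packaged formats described above are zip archives, you can peek inside one with Python's standard library to see whether it carries an extract. This is a minimal sketch: the helper name is hypothetical, and the `.hyper` / `.tde` suffixes are an assumption about how extract files are named inside the archive.

```python
import io
import zipfile
from typing import List

# Assumed extract file extensions inside a packaged workbook or datasource.
EXTRACT_SUFFIXES = (".hyper", ".tde")


def list_extracts(packaged_file) -> List[str]:
    """Return the names of extract files inside a .twbx or .tdsx archive.

    `packaged_file` may be a path or a file-like object.
    """
    with zipfile.ZipFile(packaged_file) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith(EXTRACT_SUFFIXES)
        ]


if __name__ == "__main__":
    # Build a tiny fake .twbx in memory so the sketch runs without a real file.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as fake_twbx:
        fake_twbx.writestr("Sales.twb", "<workbook/>")
        fake_twbx.writestr("Data/Extracts/sales.hyper", b"")
    print(list_extracts(buffer))
```

An empty list suggests the file only describes a connection (scenario 1), so there is no packaged extract to refresh.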

  • @user-tf9nw2oq2i 7 months ago

    Will the above example be helpful for an unpublished workbook?

  • @KathyLoisAmores 2 years ago

    Hi, I just want to confirm something. Do workbook extracts mean the backend tables used in the workbook? Can I also do a scheduled refresh, e.g. once a month?

    • @devyx 2 years ago

      Yes you can! This manual triggering discussed in the video is more applicable to situations where you want to programmatically trigger the data refreshes. However, if you simply want to refresh the data on a set schedule then you can do so without this custom code. I would recommend using the Tableau user interface for that rather than the REST API -- no code required.

  • @AkashKumar-it1mz 2 years ago

    Hi Devyx, big fan of your channel, thanks for your videos. I wanted to know how we can rerun failed extract refreshes in Tableau using the REST API.

    • @devyx 2 years ago +2

      Hey, the REST API gives you various flexible endpoints to accomplish a variety of use cases. For taking care of failed extracts, you could (for example) create a webhook that watches your extract refresh task and does X upon failure. That action, X, could involve hitting the endpoint to trigger a new attempt to refresh your extract. The API reference documents all the relevant endpoints, and tableau-api-lib is available if you don't want to code from scratch to make use of those endpoints.
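
As a hedged sketch of the "does X upon failure" step: the function below takes the JSON body a webhook delivers and decides whether to trigger another refresh attempt. The payload keys (`event_type`, `resource_luid`) and the event names are assumptions based on Tableau's webhooks documentation, so verify them against what your server actually sends. The `refreshers` mapping is hypothetical glue that you would fill with your own refresh calls.

```python
from typing import Callable, Dict, Optional

# Assumed webhook event names mapped to the kind of content that failed.
FAILED_EVENTS = {
    "DatasourceRefreshFailed": "datasource",
    "WorkbookRefreshFailed": "workbook",
}


def handle_webhook(
    payload: dict,
    refreshers: Dict[str, Callable[[str], object]],
) -> Optional[object]:
    """Trigger a new refresh attempt when the payload reports a failed refresh."""
    kind = FAILED_EVENTS.get(payload.get("event_type", ""))
    if kind is None:
        # Not a refresh failure, so there is nothing to retry.
        return None
    # Pass the ID of the failed workbook or datasource to the matching refresher.
    return refreshers[kind](payload["resource_luid"])
```

With tableau-api-lib, the mapping could be `{"datasource": conn.update_data_source_now, "workbook": conn.update_workbook_now}`. This sketch retries without any limit, so a real handler should cap the number of attempts per resource.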

    • @AkashKumar-it1mz 2 years ago

      @devyx Can you please make a video on it? I very rarely see videos about webhooks and this kind of automation.

  • @alexxu9856 3 years ago

    Thanks for sharing. I have a question about refreshing extracts on Tableau Online. We currently use the Bridge client to manage all the refresh schedules that connect to our SQL Server. I tried to use your method above to do the refresh, but the refresh job fails every time because it can't find our database. Is there any way we can link to our Bridge client and then call the refresh function so it can find our source database?

    • @devyx 3 years ago

      Hey Alex, can you add in some more information about what error happens for you? Knowing the exact response codes / error messages and the inputs you passed to the server (not including your credentials of course) are needed to understand the underlying problem.
      I have also encountered problems before where the server response says something like "can't connect to the datasource" and in some situations this has been due to silly things like port numbers that aren't actually relevant needing to be set to "0", etc.
      If you can give me an idea of what your inputs to the server look like (again, excluding your user / password / exact server address) then I might have some ideas.

    • @alexxu9856 3 years ago

      @devyx Thank you so much for responding so quickly. Do you have an email address where I can share the code with you? Thanks.

    • @devyx 3 years ago

      @alexxu9856 Even better, join the Tableau DataDev Slack channel and you'll have an entire community to help you with future questions: tabsoft.co/JoinTableauDev

    • @devyx 3 years ago

      You can find me in there, feel free to DM me those specifics of the error.

  • @umarfaridi6230 2 years ago

    Hi, would it be possible to run extract refreshes on 2 different sites? Currently I have set up the program to log into 1 site using a token. Would it be possible, with the same code, to log into another site and refresh the extracts there? Currently the code reads a table set up in SQL, checks for extracts whose flag value is "y", and then logs into Site A and refreshes those extracts. If we add other extracts to the same table with a site name, would this be possible?

    • @devyx 2 years ago

      You can refresh an extract on any site, so long as you have the appropriate credentials.
      If you are a server administrator, you can use your credentials to log into any site on the server. If you are already signed into one site as a server admin, you can use the switch_site() method to switch to another site on that server.
      If you are not a server admin, you will need to become one or otherwise have a user account (each having its own set of credentials / personal access tokens) provisioned on each site. In that case, you will need to run the same script once per site, using the user account that exists on that site.
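
For the server admin case, the flow from the question could be sketched as follows. The row layout (`site`, `datasource_id`, `flag`) is a hypothetical stand-in for the SQL table, and `conn` is assumed to be a signed-in tableau-api-lib connection exposing `switch_site()` and `update_data_source_now()`.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def refresh_flagged_extracts(conn, rows: Iterable[dict]) -> List[object]:
    """Refresh every flagged datasource, switching sites once per site."""
    ids_by_site: Dict[str, List[str]] = defaultdict(list)
    for row in rows:
        # Only rows flagged "y" (case-insensitive) are refreshed.
        if str(row["flag"]).lower() == "y":
            ids_by_site[row["site"]].append(row["datasource_id"])

    responses = []
    for site, datasource_ids in ids_by_site.items():
        conn.switch_site(site)  # the site's content URL
        for datasource_id in datasource_ids:
            responses.append(conn.update_data_source_now(datasource_id))
    return responses
```

Grouping by site first keeps the number of site switches to one per site, regardless of the order of rows in the table.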

  • @tingliang6584 1 year ago

    Is there any way I can automate this to run the code for the extract refresh every 5, 10 or 15 minutes?

    • @devyx 1 year ago

      Yes, and it's up to you how you build that workflow. The REST API enables triggering extract refreshes. How, when, and why those refreshes are triggered is entirely your decision.
      As far as workflow orchestration goes, I like Airflow. But there are almost endless tools out there for you to automate the execution of some Python code on a scheduled interval.
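
If you only need a fixed interval and no orchestration tool, the simplest possible version is a loop with a sleep. This is a bare-bones sketch: `task` is a hypothetical callable that triggers your refresh, and the `sleep` parameter exists only so the loop can be exercised without actually waiting.

```python
import time
from typing import Callable, List


def run_every(
    task: Callable[[], object],
    interval_seconds: float,
    iterations: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[object]:
    """Call `task` a fixed number of times, pausing between calls."""
    results = []
    for i in range(iterations):
        results.append(task())
        if i < iterations - 1:
            # No pause is needed after the final run.
            sleep(interval_seconds)
    return results
```

For example, `run_every(trigger_refresh, 15 * 60, iterations=4)` would run a hypothetical `trigger_refresh` four times, 15 minutes apart. A cron job or an Airflow schedule is usually more robust, since a plain loop stops as soon as the process dies.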

  • @predatorbako 2 years ago

    Hello, good stuff here. I know this is a bit late, but what do you think about using TabPy to run this Python script? The use case is that we want to allow users who aren't owners or admins to refresh extracts. The idea is that we can have a button on the report that calls a TabPy-deployed function that refreshes the extract for that workbook. It seems simpler than a JavaScript solution.

    • @devyx 2 years ago +1

      My opinion is that if it works well for you and your situation, then go for it.

    • @predatorbako 2 years ago

      @devyx Thanks for the quick reply. I'll investigate it further.

    • @wonseoryu3323 1 year ago +1

      @predatorbako Hello! Did your idea succeed?

    • @predatorbako 1 year ago

      @wonseoryu3323 I actually haven't implemented it yet but it's on my list of things to try lol

  • @dobrinstoilov 1 year ago

    Is this only for Tableau Server, or is it equally applicable to Tableau Online?

    • @devyx 1 year ago

      Hey Dobrin, I work primarily with Tableau Server and do not have any extracts to test this on Tableau Online. The library supports the REST API, which is usable for both Tableau Server and Tableau Online, however I do not know 100% which endpoints are supported in the Tableau Online environment. All I know is that full functionality does not exist for Tableau Online compared to Tableau Server.
      I recommend trying it out, and here is documentation on the relevant endpoint from Tableau's REST API reference:
      help.tableau.com/current/api/rest_api/en-us/REST/rest_api_ref_jobs_tasks_and_schedules.htm#run_extract_refresh_task

  • @TheKarasuzero 5 months ago

    Hi, I would like to know if it is possible to use the script to refresh only the extracts that failed. We have several datasource extracts that fail every day due to connection failures, so we would like to identify the datasources whose extracts failed and rerun them automatically. How can I do that?

    • @devyx 5 months ago

      I recommend looking into webhooks, which are supported via Tableau's REST API. The idea is that if an extract refresh failure occurs, the webhook triggers and you are free to associate that triggering event with any arbitrary code. That includes triggering another attempt.

    • @TheKarasuzero 5 months ago

      @devyx Hi, thank you for your answer. If you have a link that details how to do it, I will be more than happy. Merry Christmas as well.

  • @fabersoaks4975 2 years ago

    Hi,
    is there a way to change an extract to a live connection in Tableau Server and Tableau Online using the REST API + Python?

    • @devyx 2 years ago

      The REST API supports creating extracts for live connections, but off the top of my head I don't know of a REST API command to convert an extract to a live connection.
      You can certainly make the change in the Tableau Server UI.

    • @fabersoaks4975 2 years ago

      @devyx But can we modify the datasource of an existing live connection via Tableau Server Client?

    • @devyx 2 years ago

      @fabersoaks4975 For Tableau Server Client, I don't know because I do not use it. I use a different implementation of Python + Tableau REST API (tableau-api-lib).
      Can you clarify what you mean by updating a datasource? You can update some aspects of the Tableau datasource via REST API but the REST API does not support changing the inner workings of the underlying connection, such as pointing to a different database.
      Everything I know on this topic I am reading from the Tableau REST API reference. I recommend using that as your primary source of what is possible via REST API.

    • @fabersoaks4975 2 years ago

      @devyx
      I am trying to migrate a workbook from Tableau Server to Tableau Online, where we need to migrate live data. By default, the datasource of the migrated live data stays as defined on the source server. Is there a way to change the datasource of that live data to the destination datasource using Python and the REST API?

    • @devyx 2 years ago

      @fabersoaks4975 If your goal is to change from an extract to live connection, you can do this via the Tableau Server/Online UI and I am not aware of an API endpoint that does it (could be missing something, though).
      If you are publishing a datasource from Tableau Server to Tableau Online, that is not something the REST API supports and it will require (a) modifying the .tds XML, or (b) perhaps Tableau's Content Migration Tool if it now supports migrating from TS to TO.

  • @yuvakarthiking 2 years ago

    Hello, I am trying to refresh following the same steps, but I am facing a 'name or service not known' error.

    • @devyx 2 years ago

      If you post the exact error in its entirety then I might be able to point you in the right direction.

    • @yuvakarthiking 2 years ago

      @devyx Thanks for the response. Sharing the error:
      [image]
      NewConnectionError: : Failed to establish a new connection: [Errno -2] Name or service not known

    • @yuvakarthiking 2 years ago

      @devyx Hello again. I tried running the same code in a different workspace and am now getting the error below:
      /local_disk0/.ephemeral_nfs/envs/pythonEnv-dfb37260-728f-462d-87e6-51ae4535ab45/lib/python3.8/site-packages/tableau_api_lib/decorators/verification.py:156: UserWarning: Warning: could not verify your Tableau Server's API version. If using a legacy version of Tableau Server, be sure to reference the legacy Tableau Server REST API documentation provided by Tableau. Some current API methods may exist that are not available on your legacy Tableau Server. warnings.warn( Out[1]:

    • @devyx 2 years ago

      @yuvakarthiking A Google search on your urllib3 error suggests it could be an issue with incorrectly specified connection details. I recommend verifying that your config is defined correctly.

    • @devyx 2 years ago

      @yuvakarthiking The second message is a warning the library emits when it is unable to verify your Tableau API version. Combined with your first error, my guess is that your config is not quite correct or there are networking issues preventing you from communicating with your server via the REST API.
      Might be worth checking with your team if the REST API is enabled and if the server is reachable by internet traffic.
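
A "[Errno -2] Name or service not known" error means the hostname could not be resolved at all. One quick way to rule that out is to resolve the host from your configured server address before involving Tableau. This is a small standard-library sketch; the function names and example URLs are hypothetical.

```python
import socket
from typing import Optional
from urllib.parse import urlparse


def server_host(server_url: str) -> Optional[str]:
    """Extract the hostname from a server URL, or None if it cannot be parsed."""
    return urlparse(server_url).hostname


def host_resolves(server_url: str) -> bool:
    """Return True if the hostname in `server_url` resolves to an address."""
    host = server_host(server_url)
    if not host:
        # A URL without a scheme (for example "tableau.example.com") has no
        # parseable hostname, which is a common mistake in config files.
        return False
    try:
        socket.getaddrinfo(host, None)
    except socket.gaierror:
        # This is the error family behind "Name or service not known".
        return False
    return True
```

If `host_resolves()` returns False for the server address in your config, the problem is the address or the network (DNS, VPN, firewall) rather than the refresh code.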

  • @aravindreddy2495 1 year ago

    Hello @devyx, I am using the same approach to trigger the extracts from a Jupyter notebook,
    but since yesterday I have been getting this error when using the get_datasources_dataframe() function. All the datasources are published as well.
    "PaginationError:
    The Tableau Server REST API method query_data_sources did not return paginated results.
    Please verify that your connection is logged in and has a valid auth token.
    If using personal access tokens, note that only one session can be active at a time using a single token.
    Also note that the extract_pages() method wrapping this call is intended for paginated results only.
    Not all Tableau Server REST API methods support pagination. "
    Can you please let me know how I can overcome this error?
    Do I have to specify a limit parameter when configuring the connection?

    • @devyx 1 year ago

      Hey Aravind, can you try using the 'query_data_sources()' method and see if this returns any datasources to you?
      The code would look like this, assuming your connection object is named "conn":
      response = conn.query_data_sources()
      print(response.json())
      Now we are printing out the JSON response of this request, without implementing any pagination. This is the method that the 'get_datasources_dataframe()' function uses underneath the hood.
      Do you see any datasource details in that response?

    • @aravindreddy2495 1 year ago

      @devyx Thank you so much for the response. I am able to see a lot of datasources inside the JSON response.
      The response started with this line:
      {'pagination': {'pageNumber': '1', 'pageSize': '100', 'totalAvailable': '174'}, 'datasources': {'datasource': [{'project':
      So does that mean this is a bug on Tableau's side? It would also be really helpful if there is a way to extract the datasource IDs.
      Thanks in advance, devyx 🙂

    • @devyx 1 year ago

      @aravindreddy2495 Probably a bug in the 'get_datasources_dataframe' function then. I'll look into it!
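
In the meantime, the datasource IDs can be pulled straight from the JSON shown above. The sketch below assumes the response has the shape quoted in the thread and that each datasource entry carries an `id` key. The helper names are hypothetical.

```python
import math
from typing import List


def datasource_ids(response_json: dict) -> List[str]:
    """Collect the ID of every datasource on the current page of results."""
    return [ds["id"] for ds in response_json["datasources"]["datasource"]]


def pages_needed(response_json: dict) -> int:
    """Work out how many pages the server needs to return all results."""
    pagination = response_json["pagination"]
    total = int(pagination["totalAvailable"])
    page_size = int(pagination["pageSize"])
    return math.ceil(total / page_size)
```

With the numbers from the thread (page size 100, 174 available), `pages_needed()` gives 2, so a single unpaginated call only returns the first 100 datasources.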

    • @aravindreddy2495 1 year ago

      @devyx Thank you so much 😄

    • @devyx 1 year ago +1

      @aravindreddy2495 I couldn't reproduce the issue on my end -- can you share with me which version of tableau-api-lib and which version of Tableau Server you are using?

  • @madanadhikari5549 2 years ago

    Hello Devyx,
    it was a nice and very in-depth video, but I am new to Python and cannot debug this error. Kindly help me.
    After running this code: workbooks_df = querying.get_workbooks_dataframe(conn)
    I get the error below:
    PaginationError:
    The Tableau Server REST API method query_workbooks_for_site did not return paginated results.
    Please verify that your connection is logged in and has a valid auth token.
    If using personal access tokens, note that only one session can be active at a time using a single token.
    Also note that the extract_pages() method wrapping this call is intended for paginated results only.
    Not all Tableau Server REST API methods support pagination.

    • @devyx 2 years ago

      If you are on the latest Tableau Server version, their devs seem to have introduced a bug that makes the _all_ fields parameter associated with the 'Query Workbooks for Site' endpoint error. Unfortunately this will not work for you until that bug is fixed on their end. In the meantime, you can use the tableau-api-lib method "query_workbooks_on_site" which taps directly into the underlying endpoint and does not make use of their _all_ fields param.
      Hope that helps!

    • @madanadhikari5549 2 years ago

      @devyx
      Thanks for your reply. I have gone through the code of the "tableau_api_lib" package and there is no function declaration for "query_workbooks_on_site"; it appears only in readme.md.
      If you can point me to that function, that would be a great help.
      Thanks

    • @devyx 2 years ago

      @madanadhikari5549 It is "query_workbooks_for_site", sorry, I made a typo earlier.
      The library simply implements the Tableau REST API endpoint, which is documented here:
      help.tableau.com/current/api/rest_api/en-us/REST/rest_api_ref_workbooks_and_views.htm#query_workbooks_for_site
      In the library, the code implementing this endpoint can be seen here: github.com/divinorum-webb/tableau-api-lib/blob/43ec086f4163c7a9c94003bec994b7c91c6caecb/src/tableau_api_lib/tableau_server_connection.py#L1473

    • @madanadhikari5549 2 years ago

      @devyx Thanks, Devyx.
      While running this code:
      workbooks_df = querying.query_workbooks_for_site(conn)
      I got this error. Please help me debug it, as I am new to Python:
      AttributeError Traceback (most recent call last)
      in
      ----> 1 workbooks_df = querying.query_workbooks_for_site(conn)
      AttributeError: module 'tableau_api_lib.utils.querying' has no attribute 'query_workbooks_for_site'
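
A likely reading of this AttributeError: the GitHub link above points at tableau_server_connection.py, so "query_workbooks_for_site" is a method of the connection object rather than a function in the `querying` utilities module. Under that assumption, and assuming the JSON response nests workbooks under `workbooks` then `workbook`, a sketch looks like this (the helper name is hypothetical):

```python
from typing import List, Tuple


def workbook_names_and_ids(conn) -> List[Tuple[str, str]]:
    """List (name, id) pairs by calling the endpoint method on the connection."""
    # Called on the connection itself, not as querying.query_workbooks_for_site(conn).
    response = conn.query_workbooks_for_site()
    workbooks = response.json()["workbooks"]["workbook"]
    return [(wb["name"], wb["id"]) for wb in workbooks]
```

Unlike `get_workbooks_dataframe`, this returns a plain list from a single page of results rather than a DataFrame.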

  • @prasadraut9514 2 years ago

    Please answer if you know about this.
    Whenever I do this:
    response1=conn.query_job(refresh_datasource_job_id)
    print(response1.json())
    I'm getting this error
    {'error': {'summary': 'Bad Request', 'detail': "There was a problem querying job 'e7333099-e3fa-404c-9764-ded2e8780f13'.", 'code': '400031'}}

    • @devyx 2 years ago

      Hi Prasad, a quick Google search turned up this Tableau forum post. Perhaps you'll find it helpful. For error codes related to API queries, you can also use the Tableau REST API reference to look up what their various codes mean. They have documentation for the "Query Job" endpoint there as well.
      community.tableau.com/s/question/0D54T00000bkpMn/rest-api-cant-query-job-based-on-job-id-returned-by-createextract-endpoint
      Query Job endpoint documentation:
      help.tableau.com/current/api/rest_api/en-us/REST/rest_api_ref_jobs_tasks_and_schedules.htm#query_job
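
When Query Job does respond normally, a small polling loop can wait for the refresh to finish. This is a sketch under stated assumptions: `fetch_job` is a hypothetical callable that returns the parsed JSON body for a job ID, a finished job is assumed to carry a `completedAt` value, and `finishCode` is read as 0 for success, 1 for error, and 2 for cancelled, per the Query Job documentation.

```python
import time
from typing import Callable

FINISH_CODES = {"0": "success", "1": "failed", "2": "cancelled"}


def wait_for_job(
    fetch_job: Callable[[str], dict],
    job_id: str,
    poll_seconds: float = 5,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until it completes and return its outcome as a string."""
    for _ in range(max_polls):
        body = fetch_job(job_id)
        if "error" in body:
            # For example the 400031 "Bad Request" response shown above.
            raise RuntimeError(f"Query Job failed: {body['error']}")
        job = body["job"]
        if job.get("completedAt"):
            return FINISH_CODES.get(str(job.get("finishCode")), "unknown")
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not complete after {max_polls} polls")
```

With tableau-api-lib this could be called as `wait_for_job(lambda job_id: conn.query_job(job_id).json(), refresh_datasource_job_id)`.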