
Data Pipeline
*************


boto.datapipeline
=================


boto.datapipeline.layer1
========================

class boto.datapipeline.layer1.DataPipelineConnection(**kwargs)

   This is the AWS Data Pipeline API Reference. This guide provides
   descriptions and samples of the AWS Data Pipeline API.

   APIVersion = '2012-10-29'

   DefaultRegionEndpoint = 'datapipeline.us-east-1.amazonaws.com'

   DefaultRegionName = 'us-east-1'

   ResponseError

      alias of "JSONResponseError"

   ServiceName = 'DataPipeline'

   activate_pipeline(pipeline_id)

      Validates a pipeline and initiates processing. If the pipeline
      does not pass validation, activation fails.

      Parameters:
         **pipeline_id** (*string*) -- The identifier of the pipeline
         to activate.

   create_pipeline(name, unique_id, description=None)

      Creates a new empty pipeline. When this action succeeds, you can
      then use the PutPipelineDefinition action to populate the
      pipeline.

      Parameters:
         * **name** (*string*) -- The name of the new pipeline. You
           can use the same name for multiple pipelines associated
           with your AWS account, because AWS Data Pipeline assigns
           each new pipeline a unique pipeline identifier.

         * **unique_id** (*string*) -- A unique identifier that you
           specify. This identifier is not the same as the pipeline
           identifier assigned by AWS Data Pipeline. You are
           responsible for defining the format and ensuring the
           uniqueness of this identifier. You use this parameter to
           ensure idempotency during repeated calls to CreatePipeline.
           For example, if the first call to CreatePipeline does not
           return a clear success, you can pass in the same unique
           identifier and pipeline name combination on a subsequent
           call to CreatePipeline. CreatePipeline ensures that if a
           pipeline already exists with the same name and unique
           identifier, a new pipeline will not be created. Instead,
           you'll receive the pipeline identifier from the previous
           attempt. The uniqueness of the name and unique identifier
           combination is scoped to the AWS account or IAM user
           credentials.

         * **description** (*string*) -- The description of the new
           pipeline.
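
      The retry-with-the-same-token pattern described above can be
      sketched as follows. This is illustrative, not part of boto:
      ``create_pipeline_idempotent`` and the stub connection are
      assumptions, and ``conn`` stands in for any object exposing
      boto's ``create_pipeline(name, unique_id)`` signature.

```python
import uuid

def create_pipeline_idempotent(conn, name, unique_id=None):
    """Illustrative helper: generate one client-side unique_id and keep
    reusing it, so a retried create_pipeline returns the pipeline from
    the first attempt instead of creating a duplicate."""
    # The format of unique_id is caller-defined; a random hex token works.
    unique_id = unique_id or uuid.uuid4().hex
    response = conn.create_pipeline(name, unique_id)
    # On a retry with the same name/unique_id pair, AWS Data Pipeline
    # returns the original pipelineId rather than creating a new one.
    return response['pipelineId'], unique_id
```

      Persist the returned ``unique_id`` alongside the pipeline name if
      you may need to retry the call after a timeout.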

   delete_pipeline(pipeline_id)

      Permanently deletes a pipeline, its pipeline definition and its
      run history. You cannot query or restore a deleted pipeline. AWS
      Data Pipeline will attempt to cancel instances associated with
      the pipeline that are currently being processed by task runners.
      Deleting a pipeline cannot be undone.

      Parameters:
         **pipeline_id** (*string*) -- The identifier of the pipeline
         to be deleted.

   describe_objects(object_ids, pipeline_id, marker=None, evaluate_expressions=None)

      Returns the object definitions for a set of objects associated
      with the pipeline. Object definitions are composed of a set of
      fields that define the properties of the object.

      Parameters:
         * **object_ids** (*list*) -- Identifiers of the pipeline
           objects that contain the definitions to be described. You
           can pass as many as 25 identifiers in a single call to
           DescribeObjects.

         * **marker** (*string*) -- The starting point for the results
           to be returned. The first time you call DescribeObjects,
           this value should be empty. As long as the action returns
           HasMoreResults as True, you can call DescribeObjects again
           and pass the marker value from the response to retrieve the
           next set of results.

         * **pipeline_id** (*string*) -- Identifier of the pipeline
           that contains the object definitions.

         * **evaluate_expressions** (*boolean*) -- Indicates whether
           any expressions in the object should be evaluated when the
           object descriptions are returned.
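
      The marker/HasMoreResults protocol above can be sketched as a
      loop. The helper below is illustrative; the response key names
      (``pipelineObjects``, ``hasMoreResults``, ``marker``) are
      assumptions based on the AWS wire format, so verify them against
      your boto version.

```python
def describe_all_objects(conn, pipeline_id, object_ids):
    """Illustrative pagination loop: start with an empty marker, and
    keep re-calling describe_objects with the returned marker until
    the service reports no more results."""
    objects, marker = [], None
    while True:
        resp = conn.describe_objects(object_ids, pipeline_id, marker=marker)
        objects.extend(resp.get('pipelineObjects', []))
        if not resp.get('hasMoreResults'):
            return objects
        marker = resp['marker']  # pass back verbatim on the next call
```

      The same pattern applies to ListPipelines and QueryObjects, which
      use the same marker/HasMoreResults convention.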

   describe_pipelines(pipeline_ids)

      Retrieves metadata about one or more pipelines. The information
      retrieved includes the name of the pipeline, the pipeline
      identifier, its current state, and the user account that owns
      the pipeline. Using account credentials, you can retrieve
      metadata about pipelines that you or your IAM users have
      created. If you are using an IAM user account, you can retrieve
      metadata about only those pipelines you have read permission
      for.

      Parameters:
         **pipeline_ids** (*list*) -- Identifiers of the pipelines to
         describe. You can pass as many as 25 identifiers in a single
         call to DescribePipelines. You can obtain pipeline
         identifiers by calling ListPipelines.

   evaluate_expression(pipeline_id, expression, object_id)

      Evaluates a string in the context of a specified object. A task
      runner can use this action to evaluate SQL queries stored in
      Amazon S3.

      Parameters:
         * **pipeline_id** (*string*) -- The identifier of the
           pipeline.

         * **expression** (*string*) -- The expression to evaluate.

         * **object_id** (*string*) -- The identifier of the object.

   get_pipeline_definition(pipeline_id, version=None)

      Returns the definition of the specified pipeline. You can call
      GetPipelineDefinition to retrieve the pipeline definition you
      provided using PutPipelineDefinition.

      Parameters:
         * **pipeline_id** (*string*) -- The identifier of the
           pipeline.

         * **version** (*string*) -- The version of the pipeline
           definition to retrieve.

   list_pipelines(marker=None)

      Returns a list of pipeline identifiers for all active pipelines.
      Identifiers are returned only for pipelines you have permission
      to access.

      Parameters:
         **marker** (*string*) -- The starting point for the results
         to be returned. The first time you call ListPipelines, this
         value should be empty. As long as the action returns
         HasMoreResults as True, you can call ListPipelines again and
         pass the marker value from the response to retrieve the next
         set of results.

   make_request(action, body)

   poll_for_task(worker_group, hostname=None, instance_identity=None)

      Task runners call this action to receive a task to perform from
      AWS Data Pipeline. The task runner specifies which tasks it can
      perform by setting a value for the workerGroup parameter of the
      PollForTask call. The task returned by PollForTask may come from
      any of the pipelines that match the workerGroup value passed in
      by the task runner and that was launched using the IAM user
      credentials specified by the task runner.

      Parameters:
         * **worker_group** (*string*) -- Indicates the type of task
           the task runner is configured to accept and process. The
           worker group is set as a field on objects in the pipeline
           when they are created. You can only specify a single value
           for workerGroup in the call to PollForTask. There are no
           wildcard values permitted in workerGroup; the string must
           be an exact, case-sensitive match.

         * **hostname** (*string*) -- The public DNS name of the
           calling task runner.

         * **instance_identity** (*structure*) -- Identity information
           for the Amazon EC2 instance that is hosting the task
           runner. You can get this value by calling the URI,
           http://169.254.169.254/latest/meta-data/instance-id, from
           the EC2 instance. For more information, go to Instance
           Metadata in the Amazon Elastic Compute Cloud User Guide.
           Passing in this value proves that your task runner is
           running on an EC2 instance, and ensures the proper AWS Data
           Pipeline service charges are applied to your pipeline.

   put_pipeline_definition(pipeline_objects, pipeline_id)

      Adds tasks, schedules, and preconditions that control the
      behavior of the pipeline. You can use PutPipelineDefinition to
      populate a new pipeline or to update an existing pipeline that
      has not yet been activated.

      Parameters:
         * **pipeline_objects** (*list*) -- The objects that define
           the pipeline. These will overwrite the existing pipeline
           definition.

         * **pipeline_id** (*string*) -- The identifier of the
           pipeline to be configured.
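
      As a sketch of the shape of ``pipeline_objects``: each object is
      a dict with an id, a name, and a list of key/value fields. The
      field names below follow the AWS Data Pipeline wire format, and
      ``make_pipeline_object`` is an illustrative helper, not a boto
      function.

```python
def make_pipeline_object(obj_id, name, **fields):
    """Illustrative builder for one pipeline object in the wire format
    PutPipelineDefinition expects: {'id', 'name', 'fields': [...]}."""
    return {
        'id': obj_id,
        'name': name,
        # References to other objects use 'refValue' instead of
        # 'stringValue'; only string fields are shown here.
        'fields': [{'key': k, 'stringValue': v}
                   for k, v in sorted(fields.items())],
    }

# A minimal default object plus a schedule, as you might pass to
# put_pipeline_definition(pipeline_objects, pipeline_id):
pipeline_objects = [
    make_pipeline_object('Default', 'Default', scheduleType='cron'),
    make_pipeline_object('Schedule1', 'Every day', type='Schedule',
                         period='1 day',
                         startDateTime='2013-01-01T00:00:00'),
]
```

      Remember that these objects overwrite, rather than merge with,
      any existing definition.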

   query_objects(pipeline_id, sphere, marker=None, query=None, limit=None)

      Queries a pipeline for the names of objects that match a
      specified set of conditions.

      Parameters:
         * **marker** (*string*) -- The starting point for the results
           to be returned. The first time you call QueryObjects, this
           value should be empty. As long as the action returns
           HasMoreResults as True, you can call QueryObjects again and
           pass the marker value from the response to retrieve the
           next set of results.

         * **query** (*structure*) -- Query that defines the objects
           to be returned. The Query object can contain a maximum of
           ten selectors. The conditions in the query are limited to
           top-level String fields in the object. These filters can be
           applied to components, instances, and attempts.

         * **pipeline_id** (*string*) -- Identifier of the pipeline to
           be queried for object names.

         * **limit** (*integer*) -- Specifies the maximum number of
           object names that QueryObjects will return in a single
           call. The default value is 100.

         * **sphere** (*string*) -- Specifies whether the query
           applies to components or instances. Allowable values:
           COMPONENT, INSTANCE, ATTEMPT.
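
      A sketch of the query structure, assuming the AWS wire format
      for selectors; the ``@status`` field name and ``EQ`` operator
      below are assumptions to verify against your boto version.

```python
# One selector matching a top-level string field; a Query may hold up
# to ten selectors. Each selector names a field and an operator with
# the values to compare against.
query = {
    'selectors': [
        {
            'fieldName': '@status',
            'operator': {'type': 'EQ', 'values': ['FAILED']},
        },
    ],
}

# Typical call shape (not executed here):
# conn.query_objects(pipeline_id, 'INSTANCE', query=query, limit=100)
```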

   report_task_progress(task_id)

      Updates the AWS Data Pipeline service on the progress of the
      calling task runner. When the task runner is assigned a task, it
      should call ReportTaskProgress to acknowledge that it has the
      task within 2 minutes. If the web service does not receive this
      acknowledgement within the 2 minute window, it will assign the
      task in a subsequent PollForTask call. After this initial
      acknowledgement, the task runner only needs to report progress
      every 15 minutes to maintain its ownership of the task. You can
      change this reporting time from 15 minutes by specifying a
      reportProgressTimeout field in your pipeline. If a task runner
      does not report its status after 5 minutes, AWS Data Pipeline
      will assume that the task runner is unable to process the task
      and will reassign the task in a subsequent response to
      PollForTask. Task runners should call ReportTaskProgress every
      60 seconds.

      Parameters:
         **task_id** (*string*) -- Identifier of the task assigned to
         the task runner. This value is provided in the TaskObject
         that the service returns with the response for the
         PollForTask action.

   report_task_runner_heartbeat(taskrunner_id, worker_group=None, hostname=None)

      Task runners call ReportTaskRunnerHeartbeat to indicate that
      they are operational. In the case of AWS Data Pipeline Task
      Runner launched on a resource managed by AWS Data Pipeline, the
      web service can use this call to detect when the task runner
      application has failed and restart a new instance.

      Parameters:
         * **worker_group** (*string*) -- Indicates the type of task
           the task runner is configured to accept and process. The
           worker group is set as a field on objects in the pipeline
           when they are created. You can only specify a single value
           for workerGroup in the call to ReportTaskRunnerHeartbeat.
           There are no wildcard values permitted in workerGroup; the
           string must be an exact, case-sensitive match.

         * **hostname** (*string*) -- The public DNS name of the
           calling task runner.

         * **taskrunner_id** (*string*) -- The identifier of the task
           runner. This value should be unique across your AWS
           account. In the case of AWS Data Pipeline Task Runner
           launched on a resource managed by AWS Data Pipeline, the
           web service provides a unique identifier when it launches
           the application. If you have written a custom task runner,
           you should assign a unique identifier for the task runner.

   set_status(object_ids, status, pipeline_id)

      Requests that the status of an array of physical or logical
      pipeline objects be updated in the pipeline. This update may not
      occur immediately, but is eventually consistent. The status that
      can be set depends on the type of object.

      Parameters:
         * **object_ids** (*list*) -- Identifies an array of objects.
           The corresponding objects can be either physical objects or
           components, but not a mix of both types.

         * **status** (*string*) -- Specifies the status to be set on
           all the objects in objectIds. For components, this can be
           either PAUSE or RESUME. For instances, this can be either
           CANCEL, RERUN, or MARK_FINISHED.

         * **pipeline_id** (*string*) -- Identifies the pipeline that
           contains the objects.

   set_task_status(task_id, task_status, error_code=None, error_message=None, error_stack_trace=None)

      Notifies AWS Data Pipeline that a task is completed and provides
      information about the final status. The task runner calls this
      action regardless of whether the task was successful. The task
      runner does not need to call SetTaskStatus for tasks that are
      canceled by the web service during a call to ReportTaskProgress.

      Parameters:
         * **error_code** (*integer*) -- If an error occurred during
           the task, specifies a numerical value that represents the
           error. This value is set on the physical attempt object. It
           is used to display error information to the user. The web
           service does not parse this value.

         * **error_message** (*string*) -- If an error occurred during
           the task, specifies a text description of the error. This
           value is set on the physical attempt object. It is used to
           display error information to the user. The web service does
           not parse this value.

         * **error_stack_trace** (*string*) -- If an error occurred
           during the task, specifies the stack trace associated with
           the error. This value is set on the physical attempt
           object. It is used to display error information to the
           user. The web service does not parse this value.

         * **task_id** (*string*) -- Identifies the task assigned to
           the task runner. This value is set in the TaskObject that
           is returned by the PollForTask action.

         * **task_status** (*string*) -- If FINISHED, the task
           successfully completed. If FAILED, the task ended
           unsuccessfully. The FALSE value is used by preconditions.
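
      Taken together, PollForTask, ReportTaskProgress, and
      SetTaskStatus form the task-runner protocol. A minimal sketch,
      assuming the response key names ``taskObject`` and ``taskId``
      from the AWS wire format, with retries and heartbeats omitted:

```python
def run_one_task(conn, worker_group, work=lambda task: None):
    """Illustrative task-runner loop body: poll for a task, acknowledge
    it promptly, perform the work, then report the final status."""
    response = conn.poll_for_task(worker_group)
    task = response.get('taskObject')
    if not task:
        return None  # no work currently available for this worker group
    task_id = task['taskId']
    conn.report_task_progress(task_id)  # acknowledge within 2 minutes
    try:
        work(task)
    except Exception as exc:
        # Report failure; the message is displayed to the user as-is.
        conn.set_task_status(task_id, 'FAILED', error_message=str(exc))
        return task_id
    conn.set_task_status(task_id, 'FINISHED')
    return task_id
```

      A real task runner would run this in a loop and additionally call
      report_task_runner_heartbeat on a timer.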

   validate_pipeline_definition(pipeline_objects, pipeline_id)

      Tests the pipeline definition with a set of validation checks to
      ensure that it is well formed and can run without error.

      Parameters:
         * **pipeline_objects** (*list*) -- A list of objects that
           define the pipeline changes to validate against the
           pipeline.

         * **pipeline_id** (*string*) -- Identifies the pipeline whose
           definition is to be validated.


boto.datapipeline.exceptions
============================

exception boto.datapipeline.exceptions.InternalServiceError(status, reason, body=None, *args)

exception boto.datapipeline.exceptions.InvalidRequestException(status, reason, body=None, *args)

exception boto.datapipeline.exceptions.PipelineDeletedException(status, reason, body=None, *args)

exception boto.datapipeline.exceptions.PipelineNotFoundException(status, reason, body=None, *args)

exception boto.datapipeline.exceptions.TaskNotFoundException(status, reason, body=None, *args)
