MultiAgentDecisionProcess
MDPPolicyIterationGPU implements policy iteration for MDPs on the GPU.
#include <MDPPolicyIterationGPU.h>
Public Member Functions | |
| double | GetQ (Index time_step, Index sI, Index jaI) const |
| Get Q-value for finite-horizon case. | |
| double | GetQ (Index sI, Index jaI) const |
| Get Q-value for infinite-horizon case. | |
| QTable | GetQTable (Index time_step) const |
| QTables | GetQTables () const |
| MDPPolicyIterationGPU () | |
| (default) Constructor. | |
| MDPPolicyIterationGPU (const PlanningUnitDecPOMDPDiscrete &pu) | |
| void | Plan () |
| void | PlanSlow () |
| Uses the GetTransitionProbability() interface, which is slow. | |
| void | PlanWithCache (bool computeIfNotCached=true) |
| void | PlanWithCache (const std::string &filenameCache, bool computeIfNotCached=true) |
| void | SetQTable (const QTable &Q, Index time_step) |
| void | SetQTables (const QTables &Qs) |
| ~MDPPolicyIterationGPU () | |
| Destructor. | |
Public Member Functions inherited from MDPSolver | |
| Index | GetMaximizingAction (Index time_step, Index sI) |
| const PlanningUnitDecPOMDPDiscrete * | GetPU () const |
| Returns a pointer to the PlanningUnit. | |
| virtual double | GetQ (Index time_step, const JointBeliefInterface &jb, Index jaI) const |
| virtual double | GetQ (const JointBeliefInterface &jb, Index jaI) const |
| void | LoadQTable (const std::string &filename, QTable &Q) |
| void | LoadQTables (const std::string &filename, int nrTables, QTables &Qs) |
| MDPSolver () | |
| (default) Constructor. | |
| MDPSolver (const PlanningUnitDecPOMDPDiscrete &pu) | |
| void | Print () const |
| void | SetPU (const PlanningUnitDecPOMDPDiscrete &pu) |
| virtual | ~MDPSolver () |
| Destructor. | |
Public Member Functions inherited from TimedAlgorithm | |
| void | AddTimedEvent (const std::string &id, clock_t duration) |
| Adds an event of a given duration, e.g., an external program call. | |
| std::vector< double > | GetTimedEventDurations (const std::string &id) |
| Returns all stored durations (in s) for a particular event. | |
| void | LoadTimers (const std::string &filename) |
| Load timing info from file filename. | |
| void | PrintTimers () const |
| Print stored timing info. | |
| void | PrintTimersSummary () const |
| Sums data and prints out a summary. | |
| void | SaveTimers (const std::string &filename) const |
| Save collected timing info to file filename. | |
| void | SaveTimers (std::ofstream &of) const |
| Save collected timing info to ofstream of. | |
| void | StartTimer (const std::string &id) const |
| Start timing an event identified by id. | |
| void | StopTimer (const std::string &id) const |
| Stop timing an event identified by id. | |
| TimedAlgorithm () | |
| (default) Constructor. | |
| virtual | ~TimedAlgorithm () |
| Destructor. | |
Private Member Functions | |
| void | Initialize () |
| template<class M > | |
| void | Plan (std::vector< const M * > T) |
| T (a std::vector<const M*>) holds the matrices specifying the transition model, one matrix per joint action. | |
Private Attributes | |
| bool | _m_finiteHorizon |
| Are we solving a finite-horizon problem? | |
| bool | _m_initialized |
| Is the MDPPolicyIterationGPU object initialized? | |
| QTables | _m_QValues |
| _m_QValues represents the non-stationary MDP Q function. | |
MDPPolicyIterationGPU implements policy iteration for MDPs on the GPU.
| MDPPolicyIterationGPU::MDPPolicyIterationGPU | ( | ) |
inline
(default) Constructor.
| MDPPolicyIterationGPU::MDPPolicyIterationGPU | ( | const PlanningUnitDecPOMDPDiscrete & | pu | ) |
References _m_initialized.
| MDPPolicyIterationGPU::~MDPPolicyIterationGPU | ( | ) |
Destructor.
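| double MDPPolicyIterationGPU::GetQ | ( | Index time_step, Index sI, Index jaI | ) | const |
Get Q-value for finite-horizon case.
Implements MDPSolver.
| double MDPPolicyIterationGPU::GetQ | ( | Index sI, Index jaI | ) | const |
Get Q-value for infinite-horizon case.
Implements MDPSolver.
| QTable MDPPolicyIterationGPU::GetQTable | ( | Index | time_step | ) | const |
Implements MDPSolver.
References _m_QValues.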
| QTables MDPPolicyIterationGPU::GetQTables | ( | ) | const |
virtual
Implements MDPSolver.
References _m_QValues.
| void MDPPolicyIterationGPU::Initialize | ( | ) |
private
References _m_finiteHorizon, _m_initialized, _m_QValues, PlanningUnit::GetHorizon(), PlanningUnitMADPDiscrete::GetNrJointActions(), PlanningUnitMADPDiscrete::GetNrStates(), MDPSolver::GetPU(), Globals::MAXHORIZON, TimedAlgorithm::StartTimer(), and TimedAlgorithm::StopTimer().
Referenced by PlanSlow().
| template<class M > void MDPPolicyIterationGPU::Plan | ( | std::vector< const M * > | T | ) |
private
T (a std::vector<const M*>) holds the matrices specifying the transition model, one matrix per joint action.
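To illustrate why the transition model can be passed as one matrix per joint action, and why templating on the matrix type M is convenient, here is a minimal CPU-side sketch. It is not the library's code: `DenseMatrixSketch`, `ExpectedNextValue`, and the `operator()(row, col)` element access are hypothetical names and assumptions made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense matrix type (assumption). Any type offering
// operator()(row, col) could be substituted for M below.
struct DenseMatrixSketch {
    std::size_t rows, cols;
    std::vector<double> data;  // row-major storage
    DenseMatrixSketch(std::size_t r, std::size_t c)
        : rows(r), cols(c), data(r * c, 0.0) {}
    double& operator()(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    double operator()(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// For every state sI and joint action jaI, compute
//   out[sI][jaI] = sum over s2 of T[jaI](sI, s2) * V[s2],
// i.e. the expected value of the successor state.
template <class M>
std::vector<std::vector<double>> ExpectedNextValue(
    const std::vector<const M*>& T, const std::vector<double>& V) {
    const std::size_t nS = V.size();
    std::vector<std::vector<double>> out(nS, std::vector<double>(T.size(), 0.0));
    for (std::size_t jaI = 0; jaI < T.size(); ++jaI) {
        assert(T[jaI] != nullptr);  // one matrix per joint action is required
        for (std::size_t sI = 0; sI < nS; ++sI)
            for (std::size_t s2 = 0; s2 < nS; ++s2)
                out[sI][jaI] += (*T[jaI])(sI, s2) * V[s2];
    }
    return out;
}
```

Because the function only relies on element access, the same code would work with a sparse matrix type that exposes the same operator, which is presumably the motivation for the template parameter.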
| void MDPPolicyIterationGPU::Plan | ( | ) |
virtual
Implements MDPSolver.
References PlanSlow().
| void MDPPolicyIterationGPU::PlanSlow | ( | ) |
Uses the GetTransitionProbability() interface, which is slow.
This duplicates code from the templated version of Plan().
References _m_finiteHorizon, _m_initialized, PlanningUnitDecPOMDPDiscrete::GetDiscount(), PlanningUnit::GetHorizon(), PlanningUnitMADPDiscrete::GetNrJointActions(), PlanningUnitMADPDiscrete::GetNrStates(), MDPSolver::GetPU(), PlanningUnitDecPOMDPDiscrete::GetReward(), SystemOfLinearEquationsSolver::getSolution(), PlanningUnitMADPDiscrete::GetTransitionProbability(), Initialize(), TimedAlgorithm::PrintTimersSummary(), SystemOfLinearEquationsSolver::solveSystemOfLinearEquations(), TimedAlgorithm::StartTimer(), and TimedAlgorithm::StopTimer().
Referenced by Plan().
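The references above (a linear-equation solver, transition probabilities, rewards, and a discount) are consistent with textbook policy iteration, where policy evaluation solves a linear system. The following is a minimal CPU-only sketch of that textbook scheme for the infinite-horizon discounted case. It is not the GPU implementation documented here; `SolveLinearSystem`, `PolicyIteration`, and the array layouts `T[a][s][s2]` and `R[s][a]` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Solve A x = b by Gaussian elimination with partial pivoting (A is n x n).
Vec SolveLinearSystem(Mat A, Vec b) {
    const std::size_t n = b.size();
    for (std::size_t col = 0; col < n; ++col) {
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < n; ++r)
            if (std::fabs(A[r][col]) > std::fabs(A[pivot][col])) pivot = r;
        assert(std::fabs(A[pivot][col]) > 1e-12);  // system must be non-singular
        std::swap(A[col], A[pivot]);
        std::swap(b[col], b[pivot]);
        for (std::size_t r = col + 1; r < n; ++r) {
            const double f = A[r][col] / A[col][col];
            for (std::size_t c = col; c < n; ++c) A[r][c] -= f * A[col][c];
            b[r] -= f * b[col];
        }
    }
    Vec x(n);
    for (std::size_t i = n; i-- > 0;) {  // back substitution
        double s = b[i];
        for (std::size_t c = i + 1; c < n; ++c) s -= A[i][c] * x[c];
        x[i] = s / A[i][i];
    }
    return x;
}

// T[a][s][s2] = P(s2 | s, a), R[s][a] = immediate reward, 0 <= gamma < 1.
// Returns Q[s][a] computed from the value of the final (stable) policy.
Mat PolicyIteration(const std::vector<Mat>& T, const Mat& R, double gamma) {
    const std::size_t nS = R.size(), nA = T.size();
    std::vector<std::size_t> policy(nS, 0);
    Mat Q(nS, Vec(nA, 0.0));
    for (bool changed = true; changed;) {
        // Policy evaluation: solve (I - gamma * P_pi) V = R_pi.
        Mat A(nS, Vec(nS, 0.0));
        Vec b(nS);
        for (std::size_t s = 0; s < nS; ++s) {
            for (std::size_t s2 = 0; s2 < nS; ++s2)
                A[s][s2] = (s == s2 ? 1.0 : 0.0) - gamma * T[policy[s]][s][s2];
            b[s] = R[s][policy[s]];
        }
        const Vec V = SolveLinearSystem(A, b);
        // Policy improvement: switch only on strict improvement so the loop ends.
        changed = false;
        for (std::size_t s = 0; s < nS; ++s) {
            for (std::size_t a = 0; a < nA; ++a) {
                Q[s][a] = R[s][a];
                for (std::size_t s2 = 0; s2 < nS; ++s2)
                    Q[s][a] += gamma * T[a][s][s2] * V[s2];
            }
            for (std::size_t a = 0; a < nA; ++a)
                if (Q[s][a] > Q[s][policy[s]] + 1e-12) { policy[s] = a; changed = true; }
        }
    }
    return Q;
}
```

In this sketch the transition model is read from dense per-action matrices. Querying a per-entry interface such as GetTransitionProbability() inside the innermost loops would give the same result at the cost of one call per (s, a, s2) triple, which may be why that path is described as slow.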
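| void MDPPolicyIterationGPU::PlanWithCache | ( | bool | computeIfNotCached = true | ) |
virtual
Implements MDPSolver.
| void MDPPolicyIterationGPU::PlanWithCache | ( | const std::string & filenameCache, bool computeIfNotCached = true | ) |
virtual
Implements MDPSolver.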
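| void MDPPolicyIterationGPU::SetQTable | ( | const QTable & Q, Index time_step | ) |
Implements MDPSolver.
References _m_QValues.
| void MDPPolicyIterationGPU::SetQTables | ( | const QTables & | Qs | ) |
virtual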
Implements MDPSolver.
References _m_QValues.
| bool MDPPolicyIterationGPU::_m_finiteHorizon |
private
Are we solving a finite-horizon problem?
Referenced by Initialize(), and PlanSlow().
| bool MDPPolicyIterationGPU::_m_initialized |
private
Is the MDPPolicyIterationGPU object initialized?
Referenced by Initialize(), MDPPolicyIterationGPU(), and PlanSlow().
| QTables MDPPolicyIterationGPU::_m_QValues |
private
_m_QValues represents the non-stationary MDP Q function.
I.e., _m_QValues[t][sI][jaI] gives the expected reward at time step t (time-to-go = horizon - t).
Referenced by GetQTable(), GetQTables(), Initialize(), SetQTable(), and SetQTables().
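The [t][sI][jaI] indexing described above can be illustrated with a small stand-alone sketch. The type aliases and the helpers `GreedyJointAction` and `TimeToGo` are hypothetical and use nested std::vector purely for illustration; the actual QTable and QTables types of the library may be defined differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed stand-ins for the library's table types (illustration only).
using QTableSketch = std::vector<std::vector<double>>;  // indexed [sI][jaI]
using QTablesSketch = std::vector<QTableSketch>;        // indexed [t][sI][jaI]

// Index of the joint action with the highest Q-value in state sI at
// time step t. Ties are resolved in favour of the lowest index.
std::size_t GreedyJointAction(const QTablesSketch& Qs, std::size_t t,
                              std::size_t sI) {
    assert(t < Qs.size() && sI < Qs[t].size() && !Qs[t][sI].empty());
    const std::vector<double>& row = Qs[t][sI];
    std::size_t best = 0;
    for (std::size_t jaI = 1; jaI < row.size(); ++jaI)
        if (row[jaI] > row[best]) best = jaI;
    return best;
}

// Time-to-go as stated in the description: horizon - t.
std::size_t TimeToGo(std::size_t horizon, std::size_t t) {
    assert(t <= horizon);
    return horizon - t;
}
```

A lookup of this kind is roughly what one would expect behind a call such as GetMaximizingAction(time_step, sI), although the documentation here does not describe how that member is implemented.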