arXiv:2109.06368

Policy Optimization Using Semi-parametric Models for Dynamic Pricing

13 September 2021
Jianqing Fan
Yongyi Guo
Mengxin Yu
Abstract

In this paper, we study the contextual dynamic pricing problem where the market value of a product is linear in its observed features plus some market noise. Products are sold one at a time, and only a binary response indicating success or failure of a sale is observed. Our model setting is similar to Javanmard and Nazerzadeh [2019], except that we expand the demand curve to a semiparametric model and need to learn both the parametric and nonparametric components dynamically. We propose a dynamic statistical learning and decision-making policy that combines semiparametric estimation from a generalized linear model with an unknown link and online decision-making to minimize regret (maximize revenue). Under mild conditions, we show that for a market noise c.d.f. $F(\cdot)$ with $m$-th order derivative ($m \geq 2$), our policy achieves a regret upper bound of $\tilde{O}_{d}(T^{\frac{2m+1}{4m-1}})$, where $T$ is the time horizon and $\tilde{O}_{d}$ is the order that hides logarithmic terms and the feature dimension $d$. The upper bound is further reduced to $\tilde{O}_{d}(\sqrt{T})$ if $F$ is super smooth, that is, if its Fourier transform decays exponentially. In terms of dependence on the horizon $T$, these upper bounds are close to $\Omega(\sqrt{T})$, the lower bound where $F$ belongs to a parametric class. We further generalize these results to the case with dynamically dependent product features under the strong mixing condition.
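
To make the setting and the rate concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' policy. The function names (`regret_exponent`, `simulate_sale`), the Gaussian noise, and the rule "a sale occurs when the posted price does not exceed the market value" are assumptions chosen for the example; the abstract itself specifies only a linear market value plus noise, a binary sale response, and the exponent (2m+1)/(4m-1).

```python
from fractions import Fraction
import random


def regret_exponent(m: int) -> Fraction:
    """Exponent of T in the stated regret bound, (2m+1)/(4m-1), for m >= 2."""
    if m < 2:
        raise ValueError("the stated bound requires m >= 2")
    return Fraction(2 * m + 1, 4 * m - 1)


def simulate_sale(theta, x, price, rng, noise_scale=1.0):
    """Simulate one round of binary sale feedback.

    Assumptions for illustration only: the market noise is Gaussian, and a
    sale succeeds exactly when the posted price is at most the market value.
    """
    # Market value: linear in the observed features, plus market noise.
    market_value = sum(t * xi for t, xi in zip(theta, x)) + rng.gauss(0.0, noise_scale)
    # Only this binary outcome is observed by the seller.
    return 1 if price <= market_value else 0


if __name__ == "__main__":
    for m in (2, 3, 10, 100):
        e = regret_exponent(m)
        print(f"m={m}: exponent {e} ~ {float(e):.4f}")

    rng = random.Random(0)
    theta = [1.0, 0.5]   # hypothetical parameter vector
    x = [2.0, 1.0]       # hypothetical product features
    outcomes = [simulate_sale(theta, x, price=2.5, rng=rng) for _ in range(1000)]
    print("empirical sale frequency at price 2.5:", sum(outcomes) / len(outcomes))
```

For m = 2 the exponent is 5/7 (about 0.714) and for m = 3 it is 7/11 (about 0.636). As m grows, the exponent decreases toward 1/2, which is consistent with the abstract's statement that the bound approaches the $\sqrt{T}$ rate for smoother noise distributions.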
