LOC: request device configuration and publish metrics/info #3037

Merged 17 commits on Mar 14, 2023
5027b3e
handleconfig: move config URL definition into the getLatestConfig()
rouming Feb 8, 2023
45f6828
handleconfig: handle error of the inhaleDeviceConfig() call
rouming Feb 8, 2023
8611b62
zedagent: parse 'LOCConfig'
rouming Feb 12, 2023
525c857
handleconfig: request device configuration from the LOC
rouming Feb 12, 2023
e227d2e
handlemetrics: repeat send of the metrics info to the LOC
rouming Feb 13, 2023
01f06a7
zedagent,zedcloud: make deferred routines 'DeferredContext' dependent
rouming Feb 28, 2023
1bdc2f3
zedagent,zedcloud: introduce 'ignoreErr' flag for a deferred item cre…
rouming Feb 28, 2023
2ba30e4
zedcloud/deferred: introduce a KickTimer() function
rouming Mar 1, 2023
374ed65
zedagent,zedcloud: introduce periodic deferred context
rouming Feb 28, 2023
576993c
zedagent: create different deferred requests according to the destina…
rouming Feb 16, 2023
a629a6f
zedagent: make a couple of 'zedagentContext' members private
rouming Mar 1, 2023
9ddbd5c
zedagent: pass 'destinationBitset' type for all sort of "info" Publis…
rouming Feb 16, 2023
5f40052
zedagent: switch all functions which publish "info" to multiple URLs
rouming Feb 17, 2023
74a5671
zedagent: introduce 'forcePeriodic' flag for the queueInfoToDest() call
rouming Mar 1, 2023
393cc1a
zedagent: switch location publishing to the deferred requests
rouming Feb 17, 2023
1c4011e
zedagent: reuse metrics timer for publishing all info for the LOC
rouming Feb 16, 2023
91f617a
zedagent/handleconfig: remove unused member struct
rouming Mar 3, 2023
4 changes: 2 additions & 2 deletions pkg/pillar/cmd/zedagent/handleappInstMetadata.go
@@ -20,14 +20,14 @@ func handleAppInstMetaDataModify(ctxArg interface{}, key string,
func handleAppInstMetaDataDelete(ctxArg interface{}, key string, statusArg interface{}) {
appInstMetaData := statusArg.(types.AppInstMetaData)
ctx := ctxArg.(*zedagentContext)
PublishAppInstMetaDataToZedCloud(ctx, &appInstMetaData, true)
PublishAppInstMetaDataToZedCloud(ctx, &appInstMetaData, true, AllDest)
ctx.iteration++
}

func handleAppInstMetaDataImpl(ctxArg interface{}, key string, statusArg interface{}) {

appInstMetaData := statusArg.(types.AppInstMetaData)
ctx := ctxArg.(*zedagentContext)
PublishAppInstMetaDataToZedCloud(ctx, &appInstMetaData, false)
PublishAppInstMetaDataToZedCloud(ctx, &appInstMetaData, false, AllDest)
ctx.iteration++
}
4 changes: 2 additions & 2 deletions pkg/pillar/cmd/zedagent/handleblob.go
@@ -28,14 +28,14 @@ func handleBlobStatusImpl(ctxArg interface{}, key string,
status := statusArg.(types.BlobStatus)
ctx := ctxArg.(*zedagentContext)
uuidStr := status.Key()
PublishBlobInfoToZedCloud(ctx, uuidStr, &status, ctx.iteration)
PublishBlobInfoToZedCloud(ctx, uuidStr, &status, ctx.iteration, AllDest)
ctx.iteration++
}

func handleBlobDelete(ctxArg interface{}, key string, statusArg interface{}) {
status := statusArg.(types.BlobStatus)
ctx := ctxArg.(*zedagentContext)
uuidStr := status.Key()
PublishBlobInfoToZedCloud(ctx, uuidStr, nil, ctx.iteration)
PublishBlobInfoToZedCloud(ctx, uuidStr, nil, ctx.iteration, AllDest)
ctx.iteration++
}
6 changes: 3 additions & 3 deletions pkg/pillar/cmd/zedagent/handlecertconfig.go
@@ -428,9 +428,9 @@ func sendAttestReqProtobuf(attestReq *attest.ZAttestReq, iteration int) {
//We queue the message and then get the highest priority message to send.
//If there are no failures and defers we'll send this message,
//but if there is a queue we'll retry sending the highest priority message.
zedcloud.SetDeferred(zedcloudCtx, deferKey, buf, size, attestURL,
false, false, attestReq.ReqType)
zedcloud.HandleDeferred(zedcloudCtx, time.Now(), 0, true)
zedcloudCtx.DeferredEventCtx.SetDeferred(deferKey, buf, size, attestURL,
false, false, false, attestReq.ReqType)
zedcloudCtx.DeferredEventCtx.HandleDeferred(time.Now(), 0, true)
}

// initialize cipher pubsub trigger handlers and channels
90 changes: 66 additions & 24 deletions pkg/pillar/cmd/zedagent/handleconfig.go
@@ -116,6 +116,7 @@ type getconfigContext struct {
localProfileTrigger chan Notify
localServerMap *localServerMap
lastDevCmdTimestamp uint64 // From lastDevCmdTimestampFile
locConfig *types.LOCConfig

// parsed L2 adapters
vlans []L2Adapter
@@ -137,8 +138,6 @@ type getconfigContext struct {
// This information is persisted under /persist/checkpoint/localcommands
localCommands *types.LocalCommands

callProcessLocalProfileServerChange bool //did we already call processLocalProfileServerChange

configRetryUpdateCounter uint32 // received from config

// Frequency in seconds at which metrics is published to the controller.
@@ -352,10 +351,9 @@ func initZedcloudContext(networkSendTimeout uint32, agentMetrics *zedcloud.Agent
// Run a periodic fetch of the config
func configTimerTask(getconfigCtx *getconfigContext, handleChannel chan interface{}) {
ctx := getconfigCtx.zedagentCtx
configUrl := zedcloud.URLPathString(serverNameAndPort, zedcloudCtx.V2API, devUUID, "config")
iteration := 0
withNetTracing := traceNextConfigReq(ctx)
retVal, tracedReqs := getLatestConfig(getconfigCtx, configUrl, iteration, withNetTracing)
retVal, tracedReqs := getLatestConfig(getconfigCtx, iteration, withNetTracing)
configProcessingSkipFlag := retVal == skipConfig
if configProcessingSkipFlag != getconfigCtx.configProcessingSkipFlag {
getconfigCtx.configProcessingSkipFlag = configProcessingSkipFlag
@@ -395,12 +393,9 @@ func configTimerTask(getconfigCtx *getconfigContext, handleChannel chan interfac
case <-ticker.C:
start := time.Now()
iteration += 1
// In case devUUID changed we re-generate
configUrl = zedcloud.URLPathString(serverNameAndPort,
zedcloudCtx.V2API, devUUID, "config")
withNetTracing = traceNextConfigReq(ctx)
retVal, tracedReqs = getLatestConfig(
getconfigCtx, configUrl, iteration, withNetTracing)
getconfigCtx, iteration, withNetTracing)
configProcessingSkipFlag = retVal == skipConfig
if configProcessingSkipFlag != getconfigCtx.configProcessingSkipFlag {
getconfigCtx.configProcessingSkipFlag = configProcessingSkipFlag
@@ -476,7 +471,7 @@ func updateCertTimer(configInterval uint32, tickerHandle interface{}) {
// until one succeeds in communicating with the cloud.
// We use the iteration argument to start at a different point each time.
// Returns a configProcessingSkipFlag
func getLatestConfig(getconfigCtx *getconfigContext, url string,
func requestConfigByURL(getconfigCtx *getconfigContext, url string,
iteration int, withNetTracing bool) (configProcessingRetval, []netdump.TracedNetRequest) {

log.Tracef("getLatestConfig(%s, %d)", url, iteration)
@@ -495,8 +490,7 @@ func getLatestConfig(getconfigCtx *getconfigContext, url string,
getconfigCtx.configGetStatus = types.ConfigGetFail
b, cr, err := generateConfigRequest(getconfigCtx)
if err != nil {
// XXX fatal?
return configReqFailed, nil
log.Fatal(err)
}
buf := bytes.NewBuffer(b)
size := int64(proto.Size(cr))
@@ -579,9 +573,17 @@ func getLatestConfig(getconfigCtx *getconfigContext, url string,
if config != nil {
log.Noticef("Using saved config dated %s",
ts.Format(time.RFC3339Nano))

cfgRetval := inhaleDeviceConfig(getconfigCtx, config, savedConfig)
if cfgRetval != configOK {
log.Errorf("inhaleDeviceConfig failed: %d", cfgRetval)
return cfgRetval, rv.TracedReqs
}
eriknordmark marked this conversation as resolved.

getconfigCtx.readSavedConfig = true
getconfigCtx.configGetStatus = types.ConfigGetReadSaved
return inhaleDeviceConfig(getconfigCtx, config, savedConfig), rv.TracedReqs

return configOK, rv.TracedReqs
}
}
}
@@ -643,27 +645,67 @@ func getLatestConfig(getconfigCtx *getconfigContext, url string,
return invalidConfig, rv.TracedReqs
}

cfgRetval := configOK
if !changed {
log.Tracef("Configuration from zedcloud is unchanged")
// Update modification time since checked by readSavedConfig
touchReceivedProtoMessage()
goto cfgReceived
}

cfgRetval = inhaleDeviceConfig(getconfigCtx, config, fromController)
if cfgRetval != configOK {
log.Errorf("inhaleDeviceConfig failed: %d", cfgRetval)
return cfgRetval, rv.TracedReqs
}

// Inform ledmanager about config received from cloud
utils.UpdateLedManagerConfig(log, types.LedBlinkOnboarded)
getconfigCtx.ledBlinkCount = types.LedBlinkOnboarded

if !getconfigCtx.configReceived {
getconfigCtx.configReceived = true
}
getconfigCtx.configGetStatus = types.ConfigGetSuccess
publishZedAgentStatus(getconfigCtx)

if !changed {
log.Tracef("Configuration from zedcloud is unchanged")
// Update modification time since checked by readSavedConfig
touchReceivedProtoMessage()
return configOK, rv.TracedReqs
}

// Save configuration wrapped in AuthContainer.
saveReceivedProtoMessage(authWrappedRV.RespContents)

return inhaleDeviceConfig(getconfigCtx, config, fromController), rv.TracedReqs
cfgReceived:
getconfigCtx.configReceived = true

return configOK, rv.TracedReqs
}

// Returns true if attempt to get a configuration has failed, but initial
// configuration was received (either from the controller, either successfully
// read from the file)
func needRequestLocConfig(getconfigCtx *getconfigContext,
rv configProcessingRetval) bool {

return (rv != configOK && getconfigCtx.locConfig != nil)
}

func getLatestConfig(getconfigCtx *getconfigContext, iteration int,
withNetTracing bool) (configProcessingRetval, []netdump.TracedNetRequest) {

url := zedcloud.URLPathString(serverNameAndPort, zedcloudCtx.V2API,
devUUID, "config")

rv, tracedReqs := requestConfigByURL(getconfigCtx, url,
iteration, withNetTracing)

// Request configuration from the LOC
if needRequestLocConfig(getconfigCtx, rv) {
locURL := getconfigCtx.locConfig.LocURL
url = zedcloud.URLPathString(locURL, zedcloudCtx.V2API, devUUID, "config")

// If LOC configuration is outdated, then we get @obsoleteConfig
// return value (see parseConfig() for details) and we repeat on
// the next fetch attempt
rv, tracedReqs = requestConfigByURL(getconfigCtx, url,
iteration, withNetTracing)
}

return rv, tracedReqs
}
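
The fallback order introduced by getLatestConfig() can be illustrated in isolation. The following is only a minimal sketch, not the PR's code: `retval`, `fetchFunc`, `getLatest` and the example URLs are hypothetical stand-ins, and the real function also carries the iteration counter and net tracing state.

```go
package main

import "fmt"

// retval loosely mirrors configProcessingRetval from the diff.
// The concrete values are assumptions made for this sketch.
type retval int

const (
	configOK retval = iota
	configReqFailed
)

// fetchFunc is a hypothetical stand-in for requestConfigByURL().
type fetchFunc func(url string) retval

// getLatest asks the controller first. Only when that attempt does not
// return configOK and a LOC URL is known does it repeat the request
// against the LOC. It returns the last result and the URLs it tried.
func getLatest(fetch fetchFunc, controllerURL, locURL string) (retval, []string) {
	tried := []string{controllerURL}
	rv := fetch(controllerURL)
	if rv != configOK && locURL != "" {
		tried = append(tried, locURL)
		rv = fetch(locURL)
	}
	return rv, tried
}

func main() {
	// Simulate an unreachable controller and a reachable LOC.
	fetch := func(url string) retval {
		if url == "https://loc.example/config" {
			return configOK
		}
		return configReqFailed
	}
	rv, tried := getLatest(fetch,
		"https://controller.example/config", "https://loc.example/config")
	fmt.Println(rv == configOK, tried)
}
```

In this sketch an empty `locURL` plays the role of `getconfigCtx.locConfig == nil` in needRequestLocConfig().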

func saveReceivedProtoMessage(contents []byte) {
@@ -863,7 +905,7 @@ func inhaleDeviceConfig(getconfigCtx *getconfigContext, config *zconfig.EdgeDevC
if controllerEpoch != newControllerEpoch {
log.Noticef("Controller epoch changed from %d to %d", controllerEpoch, newControllerEpoch)
controllerEpoch = newControllerEpoch
triggerPublishAllInfo(getconfigCtx.zedagentCtx)
triggerPublishAllInfo(getconfigCtx.zedagentCtx, AllDest)
}
}

4 changes: 2 additions & 2 deletions pkg/pillar/cmd/zedagent/handlecontent.go
@@ -141,7 +141,7 @@ func handleContentTreeStatusImpl(ctxArg interface{}, key string,
status := statusArg.(types.ContentTreeStatus)
ctx := ctxArg.(*zedagentContext)
uuidStr := status.Key()
PublishContentInfoToZedCloud(ctx, uuidStr, &status, ctx.iteration)
PublishContentInfoToZedCloud(ctx, uuidStr, &status, ctx.iteration, AllDest)
ctx.iteration++
}

@@ -150,6 +150,6 @@ func handleContentTreeStatusDelete(ctxArg interface{}, key string,

ctx := ctxArg.(*zedagentContext)
uuidStr := key
PublishContentInfoToZedCloud(ctx, uuidStr, nil, ctx.iteration)
PublishContentInfoToZedCloud(ctx, uuidStr, nil, ctx.iteration, AllDest)
ctx.iteration++
}
78 changes: 40 additions & 38 deletions pkg/pillar/cmd/zedagent/handlelocation.go
@@ -28,7 +28,8 @@ const (
)

// Run a periodic post of the location information.
func locationTimerTask(ctx *zedagentContext, handleChannel chan interface{}) {
func locationTimerTask(ctx *zedagentContext, handleChannel chan interface{},
triggerLocationInfo chan destinationBitset) {
var cloudIteration int

// Ticker for periodic publishing to the controller.
@@ -59,28 +60,11 @@ func locationTimerTask(ctx *zedagentContext, handleChannel chan interface{}) {
for {
select {
case <-cloudTicker.C:
locInfo := getLocationInfo(ctx)
if locInfo == nil {
// Not available.
break
}
start := time.Now()
cloudIteration++
publishLocationToController(locInfo, cloudIteration)
ctx.ps.CheckMaxTimeTopic(wdName, "publishLocationToController", start,
warningTime, errorTime)

publishLocation(ctx, &cloudIteration, wdName, ControllerDest)
Comment on lines 62 to +63
Contributor

@milan-zededa don't we want to publish periodically to AllDest here (and, if so, rename the cloudTicker variable)?

Contributor

With the current code in the PR, we only publish location to the LOC when we publish all info, which happens when the controller epoch changes.

Contributor

I think we should publish location with the appTicker below. That one fires at a higher frequency (timer.location.app.interval = 20 secs) than cloudTicker (timer.location.cloud.interval = 1 hour). The idea is that LPS (and probably also LOC) are much closer network-wise and under much less stress than zedcloud, so they should be able to show more frequent location updates.

Contributor

But see below: I might have missed a new use of triggerPublishAllInfo(). It makes sense to add comments here about that new usage resulting in publishing the location, and to check that the comment for triggerPublishAllInfo isn't stating something misleading about its usage.

Contributor Author
@rouming, Mar 2, 2023

Not sure I understand the comments correctly. The default behavior is unchanged. If it is not, this is a bug.

case dest := <-triggerLocationInfo:
publishLocation(ctx, &cloudIteration, wdName, dest)
case <-appTicker.C:
locInfo := getLocationInfo(ctx)
if locInfo == nil {
// Not available.
break
}
start := time.Now()
publishLocationToLocalServer(ctx.getconfigCtx, locInfo)
ctx.ps.CheckMaxTimeTopic(wdName, "publishLocationToLocalServer", start,
warningTime, errorTime)

publishLocation(ctx, &cloudIteration, wdName, LPSDest)
case <-stillRunning.C:
}
ctx.ps.StillRunning(wdName, warningTime, errorTime)
@@ -121,8 +105,31 @@ func updateLocationAppTimer(ctx *getconfigContext, appInterval uint32) {
flextimer.TickNow(ctx.locationAppTickerHandle)
}

func publishLocationToController(locInfo *info.ZInfoLocation, iteration int) {
log.Functionf("publishLocationToController: iteration %d", iteration)
func publishLocation(ctx *zedagentContext, iter *int, wdName string,
dest destinationBitset) {
locInfo := getLocationInfo(ctx)
if locInfo == nil {
// Not available.
return
}
if dest&(ControllerDest|LOCDest) != 0 {
*iter++
start := time.Now()
publishLocationToDest(ctx, locInfo, *iter, dest)
ctx.ps.CheckMaxTimeTopic(wdName, "publishLocationToDest", start,
warningTime, errorTime)
}
if dest&LPSDest != 0 {
start := time.Now()
publishLocationToLocalServer(ctx.getconfigCtx, locInfo)
ctx.ps.CheckMaxTimeTopic(wdName, "publishLocationToLocalServer", start,
warningTime, errorTime)
}
}
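
For readers unfamiliar with the destination bitset, the dispatch in publishLocation() can be sketched on its own. The flag values below are an assumption for illustration (the diff does not show where `destinationBitset` is defined), and `route` is a hypothetical helper that only reports which paths would be taken.

```go
package main

import "fmt"

// destinationBitset values are assumed here; only the names appear
// in the diff.
type destinationBitset uint

const (
	ControllerDest destinationBitset = 1 << iota
	LOCDest
	LPSDest
	AllDest = ControllerDest | LOCDest | LPSDest
)

// route reports which publish paths are selected by dest:
// viaQueue corresponds to the publishLocationToDest() branch,
// viaLPS to the publishLocationToLocalServer() branch.
func route(dest destinationBitset) (viaQueue, viaLPS bool) {
	viaQueue = dest&(ControllerDest|LOCDest) != 0
	viaLPS = dest&LPSDest != 0
	return viaQueue, viaLPS
}

func main() {
	for _, d := range []destinationBitset{ControllerDest, LPSDest, AllDest} {
		q, l := route(d)
		fmt.Printf("dest=%03b queue=%v lps=%v\n", d, q, l)
	}
}
```

With such flags the cloud ticker, the app ticker and the trigger channel can all share one function and differ only in the `dest` argument, which is what the three `case` branches in locationTimerTask() do.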

func publishLocationToDest(ctx *zedagentContext, locInfo *info.ZInfoLocation,
iteration int, dest destinationBitset) {
log.Functionf("publishLocationToDest: iteration %d", iteration)
infoMsg := &info.ZInfoMsg{
Ztype: info.ZInfoTypes_ZiLocation,
DevId: devUUID.String(),
@@ -132,14 +139,11 @@ func publishLocationToController(locInfo *info.ZInfoLocation, iteration int) {
AtTimeStamp: ptypes.TimestampNow(),
}

log.Functionf("publishLocationToController: sending %v", infoMsg)
log.Functionf("publishLocationToDest: sending %v", infoMsg)
data, err := proto.Marshal(infoMsg)
if err != nil {
log.Fatal("publishLocationToController: proto marshaling error: ", err)
log.Fatal("publishLocationToDest: proto marshaling error: ", err)
}
infoURL := zedcloud.URLPathString(serverNameAndPort, zedcloudCtx.V2API,
devUUID, "info")

buf := bytes.NewBuffer(data)
if buf == nil {
log.Fatal("malloc error")
@@ -148,15 +152,13 @@ func publishLocationToController(locInfo *info.ZInfoLocation, iteration int) {

const bailOnHTTPErr = false
const withNetTrace = false
ctxWork, cancel := zedcloud.GetContextForAllIntfFunctions(zedcloudCtx)
defer cancel()
rv, err := zedcloud.SendOnAllIntf(ctxWork, zedcloudCtx, infoURL,
size, buf, iteration, bailOnHTTPErr, withNetTrace)
if err != nil {
// Hopefully next timeout will be more successful
log.Errorf("publishLocationToController: failed (status %d): %v", rv.Status, err)
return
}
key := "location:" + devUUID.String()

// Even for the controller destination we can't stall the queue on error,
// because this is recurring call, so set @forcePeriodic to true
forcePeriodic := true
queueInfoToDest(ctx, dest, key, buf, size, bailOnHTTPErr, withNetTrace,
forcePeriodic, info.ZInfoTypes_ZiLocation)
}
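
The comment about not stalling the queue may be easier to follow with a toy model. This is a sketch under stated assumptions, not the zedcloud implementation: `item`, `deferredQueue`, `set`, `handle` and `errUnreachable` are hypothetical names, and the sketch assumes that a failed item stays queued and that only items not marked `ignoreErr` stop processing.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnreachable is a hypothetical send error used by the example.
var errUnreachable = errors.New("destination unreachable")

// item is a hypothetical deferred request keyed like "location:<uuid>".
type item struct {
	key       string
	payload   string
	ignoreErr bool
}

type deferredQueue struct {
	items []item
}

// set adds an item, or replaces the pending item with the same key
// while keeping its position in the queue.
func (q *deferredQueue) set(it item) {
	for i := range q.items {
		if q.items[i].key == it.key {
			q.items[i] = it
			return
		}
	}
	q.items = append(q.items, it)
}

// handle tries to send the pending items in order and returns the keys
// that were sent. A failed item stays queued. If the failed item is not
// marked ignoreErr, processing stops so that later items wait behind it.
func (q *deferredQueue) handle(send func(item) error) (sent []string) {
	var remaining []item
	for i, it := range q.items {
		if err := send(it); err != nil {
			remaining = append(remaining, it)
			if !it.ignoreErr {
				remaining = append(remaining, q.items[i+1:]...)
				break
			}
			continue
		}
		sent = append(sent, it.key)
	}
	q.items = remaining
	return sent
}

func main() {
	q := &deferredQueue{}
	q.set(item{key: "location:dev", payload: "old fix", ignoreErr: true})
	q.set(item{key: "location:dev", payload: "new fix", ignoreErr: true})
	q.set(item{key: "attest:dev", payload: "quote"})
	send := func(it item) error {
		if it.key == "location:dev" {
			return errUnreachable
		}
		return nil
	}
	fmt.Println(q.handle(send), len(q.items))
}
```

Because a recurring message such as the location is regenerated on every tick, replacing the pending payload by key and skipping over its failures keeps other queued requests moving, which appears to be the motivation for setting forcePeriodic in publishLocationToDest().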

func publishLocationToLocalServer(ctx *getconfigContext, locInfo *info.ZInfoLocation) {
Contributor
@milan-zededa, Mar 2, 2023

Since you are now also refactoring a bit, would you rename this to publishLocationToLPS? Any other occurrence of LocalServer could also be renamed to LPS to avoid confusion.

Contributor Author

@milan-zededa Are you sure? 'LocalServer' is used in many places all over the code. I can, but won't it only increase confusion if I change just the function name?

Contributor

I wouldn't mind if you renamed all of it :)

Contributor Author

Not in this PR. Here I offer only one function rename! :)
