-
Notifications
You must be signed in to change notification settings - Fork 167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[incremental learning] example:keep all results whether is hardExample or not, fixed the issue of using s3 to save model #107
Conversation
whose hard coefficient is less than | ||
threshold_cross_entropy. And its default value is | ||
threshold_cross_entropy=0.5 | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 0.5 the optimal value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on that?
""" | ||
|
||
def __init__(self, threshold_img=0.5, threshold_box=0.5, **kwargs): | ||
self.threshold_box = float(threshold_box) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same question above. suggest to give optimal value .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on that?
|
||
os.environ["AWS_ACCESS_KEY_ID"] = os.getenv("ACCESS_KEY_ID", "") | ||
os.environ["AWS_SECRET_ACCESS_KEY"] = os.getenv("SECRET_ACCESS_KEY", "") | ||
os.environ["S3_ENDPOINT"] = url.netloc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't use ACCESS_KEY_ID
and ACCESS_KEY_ID
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the specified key in the framework(tensorflow)
@@ -28,6 +28,7 @@ | |||
|
|||
os.environ['BACKEND_TYPE'] = 'TENSORFLOW' | |||
LOG = logging.getLogger(__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
constant TENSORFLOW
in constant.py
, e.g., constant.backend_type.TENSORFLOW='TENSORFLOW'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do developers need to know constants in lib?
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
"""Hard Example Mining Algorithms""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest to delete this annotation, this doesn't meet the annotation specification
|
||
from sedna.common.class_factory import ClassFactory, ClassType | ||
|
||
__all__ = ('ThresholdFilter', 'CrossEntropyFilter', 'IBTFilter') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these constants put in constant.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's standard usage in python, used to restrict the import of modules.
09bf46a
to
b95134c
Compare
26cdcb3
to
bd863f9
Compare
@@ -30,6 +31,19 @@ | |||
Session = tf.Session | |||
|
|||
|
|||
_url = os.getenv("S3_ENDPOINT_URL", "http://s3.amazonaws.com") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- please put these s3 codes into a function instead of top level
- why only does tensorflow have this config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a114ba6
to
bd863f9
Compare
- Add docs and code comment - fix bugs: epoch always be 1, inference result not saved, s3 upload fail Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>
1. The algorithm of HardExampleMining should be seleted by the developer. Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>
|
||
os.environ["AWS_ACCESS_KEY_ID"] = os.getenv("ACCESS_KEY_ID", "") | ||
os.environ["AWS_SECRET_ACCESS_KEY"] = os.getenv("SECRET_ACCESS_KEY", "") | ||
os.environ["S3_ENDPOINT"] = s3_url.netloc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not env ACCESS_KEY_ID, os.getenv("ACCESS_KEY_ID") will be None,
os.environ["AWS_ACCESS_KEY_ID"] = None will report error
suggest os.environ.setdefault()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix env missing bug Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>
18bc465
to
e833a23
Compare
@@ -100,10 +105,29 @@ def deal_infer_rsl(model_output): | |||
def run(): | |||
camera_address = Context.get_parameters('video_url') | |||
|
|||
hard_example_name = Context.get_parameters('HEM_NAME', "IBT") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does a developer find the env 'HEM_NAME'? If this is a sedna system environment varibles, it should provide a function or class for the developer, instead of expose an environment name which the developer might find only in documents.
In sedna, there are two seperated parameter entries for deployer and developer, and the parameter should be subject to CRD or LIB.
for example:
# Select one of the next two lines (ignore the name of function or varibles):
ibt=IL.get_hem_from_crd() # subject to CRD
ibt=IL.IBT(0.9,0.9) # subject to LIB
instance = IL(
estimator=e,
hem=ibt
)```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an interface that needs to be modified
00b310a
to
99b6cb3
Compare
99b6cb3
to
94301c0
Compare
/assign @jaypume |
} | ||
# get hard exmaple mining algorithm from config e.g.: | ||
# {"method": "IBT", "param": {"threshold_img": 0.9}} | ||
hard_example_mining = IncrementalLearning.get_hem_algorithm( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a good idea to add a note here, it's redundant, how about to rename get_hem_algorthm
to get _hem_algorithm_from_config
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix
lib/sedna/common/config.py
Outdated
@@ -291,3 +291,27 @@ def get_parameters(cls, param, default=None): | |||
value = cls.parameters.get( | |||
param) or cls.parameters.get(str(param).upper()) | |||
return value if value else default | |||
|
|||
@classmethod | |||
def get_crd_algorithm(cls, algorithm, **param) -> dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crd
is a very special term, only work in kubernetes, algorithm developer shouldn't understand it. rihgt? how about to rename it get_algorithm_from_api
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix
94301c0
to
cdcfb57
Compare
Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>
cdcfb57
to
d6023c3
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jaypume The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
Upgrade to v0.4.0 Created-by: Vittorio Cozzolino 00609018 Author-id: 553076 MR-id: 12361350 Commit-by: Vittorio Cozzolino;KubeEdge Bot;JoeyHwong;JimmyYang20;ShiXiaohou;DanLiu;HenryChou;Yutong Wang;EnfangCui;llhuii;XinYao1994;Jie Pu;wei.ji Merged-by: Vittorio Cozzolino 00609018 E2E-issues: Description: fix incremental_learning bug - Add docs and code comment - fix bugs: epoch always be 1, inference result not saved, s3 upload fail Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Interface Improvement: 1. The algorithm of HardExampleMining should be seleted by the developer. Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, install.sh: fix LC_BIND_PORT bug rename the variable LC_BIND_PORT to SENDA_LC_BIND_PORT. Signed-off-by: llhuii <liulinghui@huawei.com>, Merge pull request kubeedge#121 from llhuii/fix-install-script-bug install.sh: fix LC_BIND_PORT bug, Update interface.py fix env missing bug Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, fix bug: aggregation of weights should occur in the AggServer Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, fix bug: Cloud worker not exiting Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, gm: refactor all features into independent dir All controllers are placed into globalmanager/controllers: 1. each feature has the independent subdirectory 2. upstream/downstream are kept as top level. Commom types/utils/worker.go are placed into globalmanager/runtime. Signed-off-by: llhuii <liulinghui@huawei.com>, fix PR comment - clean useless code - catch server exception in threads Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, gm: refactor upstream controller Split upstream controller, merge each feature CR logic code into its controller. Signed-off-by: llhuii <liulinghui@huawei.com>, gm: share client/Informer with all controllers Make all controllers sharing with: 1. kubernetes client, and informerFactory with random resync period. 2. sedna crd client, and informerFactory with random resync period. This can reduce code and improve slim performance. Signed-off-by: llhuii <liulinghui@huawei.com>, gm: add dataset controller Only handle dataset update from edge. Signed-off-by: llhuii <liulinghui@huawei.com>, Merge pull request kubeedge#106 from JoeyHwong-gk/federated fix federated learning bugs, gm: split all upstream logic into separate file Signed-off-by: llhuii <liulinghui@huawei.com>, gm: split all downstream logic into separate file Since all CR watch actions are placed into corresponding controller, controllers/downstream.go is unnecessary. Signed-off-by: llhuii <liulinghui@huawei.com>, LC: fix nil pointer dereference bug It happened in evalTask of incremental job when deploy model hasn't been synced to LC. evalTask should return error instead of logging error. And it doesn't need job id info into error, same as trainTask. Signed-off-by: JimmyYang20 <yangjin39@huawei.com>, Merge pull request kubeedge#139 from JimmyYang20/fixbug LC: fix nil pointer dereference bug, LC: send dataset update to GM only when changed number of samples has been sent to GM only when adding new data. Signed-off-by: JimmyYang20 <yangjin39@huawei.com>, Merge pull request kubeedge#137 from JimmyYang20/main LC: send dataset update to GM only when changed, lifelong learning s3 support - fix file_ops method - fix kb save bug Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Add object search and tracking docs to docs/proposals/ Add object search and tracking crd samples to build/crd-samples/sedna/ Add object search and tracking type.go files to pkg/apis/sedna/v1alpha1/ Signed-off-by: EnfangCui <17111008@bjtu.edu.cn>, Merge pull request kubeedge#100 from EnfangCui/add-multi-edge-inference-PR Add object search and tracking proposals, gm: more code clean after initial refactor done 1. remove the feature redundant name in all feature controllers(e.g. 'federatedlearningJob' to 'job'), since it has already own independent package, no need the feature extra name 2. upstream interface optimizaztion 3. fix empty Kind of all CR in downstream 4. add extra doc string 5. fix code style Signed-off-by: llhuii <liulinghui@huawei.com>, Fix the problem that kbimage cannot be compiled in Makefile Signed-off-by: wei.ji <wei.ji@easystack.cn>, improve lifelong learning docs 1. improve the atc example words 2. fix the broken links in lifelong proposal Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Merge pull request kubeedge#146 from Jw-Jm/main Fix the problem that kbimage cannot be compiled in Makefile, make the hard_example_mining alg to be a common interface Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Merge pull request kubeedge#134 from llhuii/refactor-gm gm: decouple all features into independent package, Merge pull request kubeedge#107 from JoeyHwong-gk/incremental [incremental learning] example:keep all results whether is hardExample or not, fixed the issue of using s3 to save model, Merge pull request kubeedge#143 from JoeyHwong-gk/lldoc improve lifelong learning docs, fix example bug: save result which get from cloud if is hard example fix message when http connect fail Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, fix pr comment - make the hard_example_mining alg to be a common interface - fix get_hem_from_config: raise exception when value is unexpected Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, lc: decouple all features into independent package Signed-off-by: JimmyYang20 <yangjin39@huawei.com>, Merge pull request kubeedge#117 from JoeyHwong-gk/joint joint_inference: bug fix and interface reconstruction, Merge pull request kubeedge#149 from JimmyYang20/refector-lc lc: decouple all features into independent package, fix lifelong issue - Reduce parameters for initial - show all interfaces of lifelong learning in example - fix bugs from deep learning framework Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, fix il doc Signed-off-by: JimmyYang20 <yangjin39@huawei.com>, Merge pull request kubeedge#153 from JimmyYang20/fix-doc Fix rendering issue of example doc in readthedocs, Merge pull request kubeedge#142 from JoeyHwong-gk/lls3 lifelong learning: issue-driven interface-adjustment and bug fix, fix the lifelong example problem from backend and constant - fix sklearn backend: support args in train/eval/infer - fix lifelong constant Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Automatic push images when publishing a release A github action is added for pushing image when a new release is created: 1. login docker hub. 2. checkout the project, and run `make push-all`. Signed-off-by: llhuii <liulinghui@huawei.com>, Merge pull request kubeedge#154 from JoeyHwong-gk/lifelong [Lifelong example]: fix the problem from backend and constant, docs: update install guide 1. add GM/LC links 2. add GM/LC deploy form Signed-off-by: llhuii <liulinghui@huawei.com>, Merge pull request kubeedge#155 from llhuii/add-image-push-gh-action Push images automatically when a new release is created, Merge pull request kubeedge#156 from llhuii/update-install-doc docs: update install guide, IL: LC supports to recover job when restart Signed-off-by: JimmyYang20 <yangjin39@huawei.com>, Merge pull request kubeedge#152 from JimmyYang20/fixbug IL: LC supports to recover job when restart, Fix IMAGE_REPO in github image-publish action Using the env 'GITHUB_REPOSITORY' instead of 'GITHUB_ACTOR' to get the right image repo name i.e. `IMAGE_REPO` in Makefile. Signed-off-by: llhuii <liulinghui@huawei.com>, add lib doc Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Improve the docs Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, fix syntax and information in the docs Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, update lib doc Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Update s3 example docs of IL&JI Signed-off-by: JimmyYang20 <yangjin39@huawei.com>, Support websocket reconnection Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Lib support hot model update Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Adjusting the Log of IncrementalLearning example Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Fix codegen verify checker Note the codegen verify checker should report error Signed-off-by: llhuii <liulinghui@huawei.com>, Add the missing gencode for objectsearch/tracking Signed-off-by: llhuii <liulinghui@huawei.com>, fix job_kind value in LC_report Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Update dependency of server request in lib - replace `retry==1.3.3` with `tenacity==8.0.1` because of `retry` no longer maintained. Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Merge pull request kubeedge#158 from llhuii/fix-imagerepo-of-image-publish-action Fix IMAGE_REPO in github image-publish action, Merge pull request kubeedge#164 from JoeyHwong-gk/federated Support websocket reconnection when the server status is abnormal, Merge pull request kubeedge#166 from llhuii/fix-verify-checker Fix codegen verify checker, Add contributing docs Signed-off-by: llhuii <liulinghui@huawei.com>, Merge pull request kubeedge#159 from JoeyHwong-gk/libdoc update lib doc, Merge pull request kubeedge#148 from llhuii/add-contributing-docs Add contributing docs, Merge pull request kubeedge#160 from JimmyYang20/doc-s3 Update s3 example docs of IL&JI, fix access exceptions when rendering with sphinx Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Merge pull request kubeedge#150 from JoeyHwong-gk/docs docs improvement, GM&LC: IL supports model hot updates Signed-off-by: JimmyYang20 <yangjin39@huawei.com>, fix pr comment Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Merge pull request kubeedge#138 from JimmyYang20/modelhotupdate GM&LC: IL supports model hot updates, Fix s3 example docs of IL&JI Signed-off-by: JimmyYang20 <yangjin39@huawei.com>, Merge pull request kubeedge#174 from JimmyYang20/doc-s3 Fix s3 example docs of IL&JI, Merge pull request kubeedge#157 from JoeyHwong-gk/hot_model [Lib Support] hot model update, Upgrade gorilla/websocket from v1.4.0 to v1.4.2 This upgrade fixes a potential DoS vector bug in gorilla/websocket 1.4.0, see GHSA-jf24-p9p9-4rjh Signed-off-by: llhuii <liulinghui@huawei.com>, Merge pull request kubeedge#182 from llhuii/upgrade-websocket Upgrade gorilla/websocket from v1.4.0 to v1.4.2, fix lib/requirements - This upgrade fixes a CSRF error in FastAPI version earlier than 0.65.2, see GHSA-8h2j-cgx8-6xv7 Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>, Merge pull request kubeedge#183 See merge request butterfly/sedna!9
Signed-off-by: JoeyHwong joeyhwong@gknow.cn